Codepan GmbH

Data Engineer (m/f/d)

Remote (India)
Employee
Data Processing, Data Engineer

Codepan – founded in 2014 – is a Berlin-based AI Innovation Hub. Our team of passionate data scientists, engineers, and technologists applies machine learning to solve real-world problems for clients as well as to incubate and accelerate our own AI product ideas.

Codepan is currently developing an AI-based product that uses state-of-the-art LLM technologies for intelligent document processing.

Tasks

As a Data Engineer in our team, you'll architect, build, and maintain advanced data pipelines and storage solutions. You'll play a pivotal role in enabling our analytics and AI teams to work efficiently with large datasets, including those used for training and deploying Large Language Models (LLMs) and Retrieval-Augmented Generation (RAG) models.

Key Responsibilities:

  • Design and optimize scalable data pipelines to support advanced analytics, machine learning, and AI projects, with a particular focus on applications involving LLMs and RAG.
  • Develop robust data warehousing solutions that ensure fast, reliable access to large volumes of data, optimizing for query performance and system scalability.
  • Collaborate with AI research and development teams to understand data requirements and ensure the seamless integration of AI models with our data ecosystem.
  • Implement data governance and security measures, adhering to best practices and regulatory standards to safeguard sensitive information.
  • Utilize and advocate for cloud-based technologies and services to enhance our data processing capabilities, ensuring our infrastructure is both flexible and cost-effective.
  • Regularly evaluate and adopt new tools and technologies to keep our data infrastructure at the forefront of industry standards, particularly those enhancing LLM and RAG functionalities.
  • Simplify complex data flows, making data easily accessible for non-technical stakeholders while maintaining the integrity and confidentiality of the data.

Requirements

  • Bachelor’s or Master’s degree in Computer Science, Engineering, Information Systems, or a related field.
  • At least 5 years of proven experience in data engineering, with a track record of developing scalable data solutions.
  • Strong technical expertise in SQL/NoSQL databases, Python, Java, and ETL processes.
  • Experience with cloud platforms (Azure/GCP) and familiarity with big data technologies.
  • A keen interest in AI technologies, especially LLMs and RAG models, with a desire to stay updated on the latest trends and techniques.
  • Solid foundation in data security principles and a commitment to implementing privacy-compliant data management practices.
  • Excellent problem-solving skills, ability to work collaboratively in a team environment, and strong communication skills for explaining complex technical concepts.

Join us to contribute to cutting-edge projects in AI and analytics, leveraging your expertise to create impactful data solutions.

Salary: 30L - 35L, remote job in India

Updated: 1 week ago
Job ID: 11159198

Codepan GmbH

11-50 employees
Technology, Information and Internet

Codepan is a boutique agency focused on building innovative, AI-driven products & solutions for clients.
