Data Engineer R&D (hybrid) (VAC-D2343)

Full Time @StaffMatters Recruitment Specialists in IT
  • Post Date : April 4, 2024
  • Apply Before : April 19, 2024
  • Salary: Negotiable

Job Detail

  • Job ID 25130

Job Description

Our client, a cybersecurity company in Nicosia, is looking to hire an experienced Data Engineer who works with large language models (LLMs) to join its Research and Development team. The role is crucial for developing and maintaining the scalable data pipelines and infrastructure that support the training and deployment of LLMs. The ideal candidate will combine data engineering skills with a deep understanding of managing data for LLMs and other advanced models, from preprocessing through to optimization for performance at scale.

Responsibilities:
  • Design, build, and maintain scalable and efficient data pipelines specifically tailored for training and deploying large language models
  • Work closely with data scientists and machine learning engineers to understand data requirements for LLM projects, including data collection, processing, and storage needs
  • Implement and manage data ingestion routines from a variety of sources, ensuring data quality and accessibility for LLM training
  • Optimize data infrastructure to support the computational demands of LLMs, including performance tuning and scalability improvements
  • Develop tools and processes for monitoring and analyzing data pipeline performance and data quality, ensuring the integrity and availability of data
  • Collaborate with cross-functional teams to ensure seamless integration of LLMs into production environments, including support for model versioning, deployment, and monitoring
  • Stay abreast of the latest developments in large language models, data engineering practices, and technologies to continually improve pipeline efficiency and model performance
  • Ensure compliance with data governance and security policies throughout the data lifecycle, from ingestion to model deployment

Requirements:
  • At least 2 years of proven experience as a Data Engineer, with specific experience working on projects involving large language models
  • Strong expertise in data modelling, ETL processes, and data pipeline tools
  • Proficiency in programming languages commonly used in data engineering and machine learning, such as Python and SQL
  • Experience with big data technologies (e.g., Hadoop, Spark) and cloud services (AWS, Google Cloud, Azure) tailored for machine learning and data processing workloads
  • Knowledge of containerization and orchestration technologies (e.g., Docker, Kubernetes) for deploying and managing LLM applications
  • Familiarity with machine learning operations (MLOps) practices for managing the lifecycle of machine learning models, including large language models
  • Excellent problem-solving skills, with the ability to work independently and as part of a team in a fast-paced environment
  • Strong communication skills, with the ability to explain complex technical concepts to non-technical stakeholders
  • Fluency in Greek and English

Working hours:
Working hours are 9am-6pm (20-minute break), with Friday afternoons off. The role is hybrid.

TO APPLY for this job opportunity, send your CV (in English, please) to [email protected] and include the reference: Data Engineer R&D (hybrid) – VAC-D2343. We look forward to hearing from you!
