Vector Database Engineer
OUR CLIENT
Our client provides data-driven, action-oriented solutions to business problems through statistical data mining, cutting-edge analytics techniques, and a consultative approach. Leveraging proprietary methodology and best-of-breed technology, our client's analytics team takes an industry-specific approach to transform decision-making and embed analytics more deeply into their business processes. They have a global footprint of 2,000+ data scientists and analysts who assist client organizations with complex risk minimization methods, advanced marketing, pricing and CRM strategies, internal cost analysis, and cost and resource optimization within the organization. They serve the insurance, healthcare, banking, capital markets, utilities, retail and e-commerce, travel, transportation and logistics industries.
ROLE
We are seeking two Vector Database Engineers with experience in designing, implementing, and optimizing vector databases for Large Language Models (LLMs). The candidate should have strong experience in fundamental data engineering concepts, including automated pipeline builds and CI/CD on Azure cloud. The candidate should also be able to communicate technical concepts effectively to both technical and non-technical stakeholders.
RESPONSIBILITIES
- 4.5+ years of experience in deploying data engineering solutions
- 3+ years of data engineering experience comprising database solutions, CI/CD pipelines, and automated deployment/testing in an Azure environment (Azure cloud, DevOps, Azure Data Factory (ADF)) using Python, PySpark, or Spark-SQL
- 6-12 months of vector database experience, including implementing databases like ChromaDB/Pinecone/FAISS and working with embedding, vectorization, performance tuning, indexing strategies, front-end integration, query optimization, and efficient retrieval
- Collaborate with application developers to integrate vector databases into various front-end UI solutions
- Implement and maintain data security measures for vector databases
- Ensure compliance with relevant data protection regulations and industry standards
- Work closely with cross-functional teams, including data scientists, software engineers, and product managers
TECHNICAL SKILLS
- Experienced in deploying data engineering solutions - database solutions, CI/CD pipelines, automated deployment/testing
- Proficient in data manipulation on Azure (Azure Cloud, Azure Data Factory) using Python, PySpark, Spark-SQL, Java, or Scala
- Experienced in developing and implementing vector databases - ChromaDB/Pinecone/FAISS
- Experienced with vector database concepts - embedding, vectorization, performance tuning, indexing strategies, front-end integration
- Familiarity with AI/ML concepts - retrieval-augmented generation (RAG), NLP, LLM frameworks like LangChain
- Knowledge of distributed database systems
CANDIDATE PROFILE
- Bachelor's/Master's degree in economics, mathematics, computer science/engineering, operations research, or related analytics areas; candidates with BA/BS degrees in the same fields from top-tier academic institutions are also welcome to apply
- Outstanding written and verbal communication skills
- Superior analytical and problem-solving skills
- Experience working in dual-shore engagements is preferred
- Must have experience in managing clients directly
- Strong record of achievement, solid analytical ability, and an entrepreneurial hands-on approach to work
LOCATION
The candidate can be based anywhere in the US or Canada