.NET Web Crawler
Company Description
Our client is a leading software company, headquartered in Norway, that specializes in managing and maintaining safety data sheets (SDS) for a wide range of industries. Their database holds over 14 million safety data sheets, which require constant updates from various manufacturers. To streamline this process, they rely on cutting-edge crawler technology to keep the database up to date.
About The Role
As a .NET Web Crawler, you will be responsible for creating and maintaining automated web crawlers that check manufacturers' websites for updated versions of safety data sheets. Your work will directly contribute to the accuracy and efficiency of our client's data management process, ensuring that they always have the latest safety information for their clients.
Key Responsibilities
- Develop and maintain high-performance .NET-based crawlers and spider bots to search manufacturers' websites for updated safety data sheets (SDS).
- Build systems to track, identify, and mark updated SDS versions within the database.
- Automate the download and replacement of outdated SDS files with the latest versions.
- Ensure that crawlers are efficient, scalable, and resilient to website structure changes.
- Implement robust error handling, retry mechanisms, and logging for monitoring crawler performance.
- Collaborate with cross-functional teams (e.g., data management, IT, and QA) to understand requirements and improve data integrity.
- Optimize crawler performance to handle large-scale web scraping tasks (millions of SDS records).
- Stay up to date with industry best practices for web scraping, crawling, and .NET technologies.
Skills & Qualifications
- Minimum of 3 years of relevant work experience.
- Bachelor's degree in Software Engineering or a related field, or equivalent work experience.
- Strong experience with .NET and C# development.
- Experience with crawlers or spider bots is highly desirable. If you lack this experience, our client is willing to train you, but you must have a strong interest in learning web crawling.
- Knowledge of web scraping, HTTP protocols, and web technologies (HTML, CSS, JavaScript, etc.).
- Proficiency in working with APIs and data storage solutions (e.g., databases, file systems, cloud storage).
- Strong communication skills and ability to work independently or as part of a team.
- Experience with multi-threading, asynchronous programming (async/await), and parallelism in .NET for efficient data processing.
- Experience with Selenium or other tools for handling dynamic web content.
- Ability to write clean, maintainable, and scalable code.
- Strong debugging, problem-solving, and troubleshooting skills.
- Knowledge of best practices for handling large datasets and optimizing performance.
- Experience with cloud platforms like Azure or AWS for distributed crawling and storage.
- Experience with version control systems (e.g., Git).
- Familiarity with libraries such as HtmlAgilityPack, AngleSharp, or similar for HTML parsing and web scraping in .NET.
- Familiarity with environments that handle large numbers of records or large-scale web crawling.
- Familiarity with safety data sheets (SDS) or regulatory compliance standards is a plus.
Employment Structure
- Hybrid in Dhaka | Full-time
- Salary: BDT 80,000 - 120,000
- Benefits: 2 annual bonuses once employment is made permanent (probation is 3-6 months)
- Work Week: Monday - Friday, 10 am to 6 pm BST (Bangladesh Standard Time)
Hiring Process
1. Conversation with Talvette
2. Home-based technical assignment
3. Interview with the client's management team
4. Receive an offer
5. Join their team full-time