Python Web Scraping Engineer (Telangana)
Telangana, India
Posted: less than a week ago
Description
Role Summary
We are looking for a hands-on Python Web Scraping and Automation Engineer to build, maintain, and support enterprise-grade crawling, data extraction, validation, and integration workflows. The role involves working with Scrapy, BeautifulSoup, Selenium, Playwright, SOAP APIs, Docker, Git/GitHub, and Zyte-based production deployments. The candidate should be comfortable handling pagination, dynamic websites, retries, logging, data validation, error handling, and production issue debugging. The engineer will also support workflow automation using tools such as Microsoft Power Automate and collaborate with project teams to improve operational reliability and reduce manual effort.
Key Responsibilities
- Build and maintain web scraping workflows using Python and Scrapy.
- Extract structured data from static and dynamic websites.
- Handle pagination, sessions, retries, throttling, duplicate records, and missing data.
- Use Selenium or Playwright where browser automation is required.
- Integrate with enterprise systems using SOAP APIs.
- Package and deploy automation workloads using Docker.
- Operate and support scraping workloads in Zyte.
- Implement logging, error handling, traceability, and root-cause analysis.
- Maintain code using Git/GitHub best practices.
- Support data validation, reconciliation, and production issue resolution.
Must-Have Technical Skills
- Python: Build scraping scripts, automation utilities, validation checks, logging, retries, and error-handling logic.
- Scrapy: Create scalable spiders, manage pagination, pipelines, middlewares, throttling, and structured outputs.
- BeautifulSoup: Parse HTML/XML pages and extract clean data from irregular or nested page structures.
- Selenium: Automate browser-based workflows where UI interaction or verification is required.
- Playwright: Handle modern JavaScript-heavy websites with reliable browser automation.
- SOAP API: Integrate with enterprise services to send, receive, and validate structured payloads.
- Docker: Package scraping and automation workloads for consistent execution across environments.
- Git/GitHub: Manage source code through branches, commits, pull requests, reviews, and version control.
- Logging and Error Handling: Track failures, retries, exceptions, and production issues clearly.
- Production Debugging: Investigate broken crawls, selector failures, blocked requests, data mismatches, and job failures.
Good-to-Have Technical Skills
- SQL: Validate scraped or loaded data, reconcile records, and support troubleshooting through queries.
- Jenkins: Run scheduled jobs, CI/CD pipelines, automated tests, and deployment checks.
- Zyte: Deploy, monitor, and operate production-grade web scraping workloads.
- Microsoft Power Automate: Trigger workflows, notifications, approvals, and operational routing.
- Microsoft Power Apps: Build lightweight internal apps for tracking requests, runs, and exceptions.
- NLP: Support text extraction, classification, normalization, and content validation use cases.
- Machine Learning: Apply basic anomaly detection or pattern-based validation to improve data quality.
- Node.js: Build lightweight services, APIs, or integration utilities when needed.
- Payment Gateway Integrations: Work with payment APIs, webhooks, transaction payloads, and reconciliation flows.
Apply on Kit Job: kitjob.in/job/4msu8u
Highlights
Company name: Straive
Job position: Python Web Scraping Engineer (Telangana)
More info about this ad
Python Web Scraping Engineer (Telangana) has been posted in the Jangaon Engineering category on Locanto.