Senior Consultant - Talend DI
Job Title: Senior Consultant
Location: TRIL GTC Chennai
AstraZeneca is a global, innovation-driven biopharmaceutical business that focuses on the discovery, development and commercialization of prescription medicines for some of the world's most serious diseases. But we're more than one of the world's leading pharmaceutical companies. At AstraZeneca, we're proud to have a unique workplace culture that inspires innovation and collaboration. Here, employees are empowered to express diverse perspectives and are made to feel valued, energized and rewarded for their ideas and creativity.
Department – Data & Analytics, R&D IT
R&D IT is a global IT capability supporting Drug Research, Drug Development, Product & Portfolio Strategy, Medical Affairs, Finance, HR, Compliance, Legal and Global Business Services. We are organized around 7 key capability areas, including Business Partnering, Solution Delivery, Architecture, Application Support, Data & Analytics, and Change & Operations, and we operate out of sites across the US, UK, Sweden, India and Mexico.
The Data & Analytics team provides technical support to the analytics and data insight services and solutions that are critical to the emerging Data & AI/ML strategy and mission of R&D Science IT and AZ. Data & Analytics is organized into teams specializing in Information Architecture, Data Engineering, Visual Engineering, Knowledge Management, Data Science, Data Analysis and Information Governance.
We are looking for a Data Engineer to help us build intelligent applications that use our structured and unstructured data to derive key insights. As part of the R&D Data Foundation engineering group, you will work together with ML engineers and data scientists to build the data foundations supporting R&D.
We are building a global Competitive Intelligence platform that will provide industry-leading competitive intelligence across our R&D and Commercial organisations. As a member of our team, you will be primarily responsible for implementing ETL processes.
You should be well-versed in the design and development of ETL processes and databases for large data products, as well as in maintaining and supporting production environments.
- Working as part of a DevOps team implementing and supporting ETL workflows. Data sources will be structured, semi-structured and unstructured.
- Working with suppliers, data scientists, machine learning engineers, and platform teams to acquire and process data.
- Analysing data requirements and source data, modelling the source, and determining the best methods for extracting, transforming and loading the data into the data lake and processing it through the layers of the lake.
- Providing technical recommendations around design, architecture, integration and support of the entire data sourcing platform with a focus on high availability, performance, scalability and maintainability.
- Acting as the ETL technical liaison, working with technical infrastructure teams to resolve problems and implement solutions to technical issues impacting application performance.
- Leading data administration tasks such as scheduling jobs, troubleshooting job errors, identifying issues with job windows, assisting with backups, rollback and performance tuning.
- Testing, documenting and quality-assessing new data solutions to ensure they are fit for release.
- Communicating and coordinating with members of the development team to work across multiple projects. Exploring, actively supporting and working on new technology initiatives that may be of interest to the organization.
- Automating all ETL processes within a job workflow.
- Documentation of data engineering workflows to support downstream use
- Testing of data in analytics applications, to ensure data validity and reconciliation to source systems
- Development of domain expertise in sub-domains of the Science & Enabling Unit portfolio – understanding of business process, data flows, data provenance, data restrictions and data use.
Highly Desirable Knowledge, Skills and Experience
- 6+ years of experience in data engineering (ETL workflows).
- B.Tech/M.Tech/MSc in Computer Science.
- A strong understanding of databases and source systems, including experience with RDBMS, NoSQL and graph technologies.
- Understanding and using APIs to pull data
- Experience of object-oriented programming (Java, Python, C#) and shell scripting.
- Experience of using ETL/orchestration tools and software (Talend, Airflow, SnapLogic, Informatica, etc.). We currently use Talend.
- Proven record of ETL pipeline performance tuning
- Good software development skills with demonstrable knowledge of Python and Java, and of source control (Git) and versioning.
- Working knowledge of cloud environments (AWS preferred)
- Hands-on experience with developing large-scale data pipelines.
- Hands-on experience with big data technologies such as Hadoop and Spark.
- Experience of semi-structured (XML, JSON) and unstructured data handling including extraction and ingestion via web-scraping and FTP/SFTP.
- Excellent communication and facilitation skills.
- Good written and verbal skills, fluent English.
- Experience of delivering solutions within IT projects delivered through Agile and Waterfall methodologies.
- Debugging & root cause analysis
- Enthusiasm to learn and apply new concepts and technologies with a ‘can-do’ attitude.
Additional skills and experience sought
In addition, these are the bonus skills for this position:
- Experience of working with data scientists and their methods: understanding of how data needs to be prepared for use by data scientists.
- Experience of working within a range of data architectures
- Knowledge of deploying applications on Kubernetes clusters
- Knowledge of Apache Spark application development
- Supporting a data-centric application.
- Working with APIs (support or development).
We are an equal opportunity employer and value diversity at our company. We do not discriminate on the basis of race, religion, color, national origin, sex, gender, gender expression, sexual orientation, age, marital status, veteran status, or disability status.
AstraZeneca embraces diversity and equality of opportunity. We are committed to building an inclusive and diverse team representing all backgrounds, with as wide a range of perspectives as possible, and harnessing industry-leading skills. We believe that the more inclusive we are, the better our work will be. We welcome and consider applications to join our team from all qualified candidates, regardless of their characteristics. We comply with all applicable laws and regulations on non-discrimination in employment (and recruitment), as well as work authorisation and employment eligibility verification requirements.