Job description
About the role
The Digital Technology / Innovation AI, Machine Learning, and Data Engineering team is looking for a highly motivated and experienced Senior Data Engineer. The right candidate will have expert-level experience supporting Artificial Intelligence / Machine Learning (AI/ML) platforms and products, and the data ingestion/provisioning activities that move data between diverse sources and the Enterprise Data Lake. As a senior data engineer, you will work with Wells Fargo business and data science teams to gather business data requirements and perform the data engineering/provisioning needed to build, explore, train, and run business models. The senior data engineer will use ETL tools such as Informatica and Ab Initio, along with data warehouse tools, to deliver critical model operationalization services to the enterprise.
Responsibilities:
- Perform data modeling, coding, analytical modeling, root-cause analysis, investigation, debugging, and testing in collaboration with business partners, product managers, architects, and other engineering teams
- Adopt and enforce best practices for ingesting data into and extracting data from the big data platform
- Extract business data from multiple data sources and store it in MapR-DB and HDFS locations (see the ingestion sketch after this list)
- Work with Data Scientists and build scripts to meet their data needs
- Work with Enterprise Data Lake team to maintain data and information security for all use cases
- Build automation scripts using AutoSys to automate the loads
- Design and develop scripts and configurations to load data successfully using Data Ingestion Frameworks or Ab Initio
- Coordinate user access requests for data loaded into the Data Lake
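
For illustration, a minimal PySpark sketch of the kind of ingestion and provisioning work described above. The source connection, table names, and HDFS path are hypothetical placeholders, not details of this role's actual environment:

    # Illustrative only: a minimal PySpark ingestion job.
    # Connection details, table names, and paths are hypothetical.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("example-ingestion").enableHiveSupport().getOrCreate()

    # Read from a source system via JDBC (all options are placeholders)
    df = (spark.read.format("jdbc")
          .option("url", "jdbc:teradata://example-host/DATABASE=sales")
          .option("dbtable", "daily_transactions")
          .option("user", "svc_account")
          .option("password", "********")
          .load())

    # Basic provisioning step: keep only the columns downstream model consumers need
    curated = df.select("txn_id", "account_id", "amount", "txn_date")

    # Land the curated data in an HDFS location as Parquet for downstream use
    curated.write.mode("overwrite").parquet("hdfs:///data/lake/curated/daily_transactions")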
Essential Qualifications
- BS/BA degree
- Excellent analytical and problem-solving skills with high attention to detail and accuracy
- Demonstrated ability to transform business requirements to code, metadata specifications, specific analytical reports and tools
- Good verbal, written, and interpersonal communication skills
- 5+ years of ETL (Extract, Transform, Load) programming with tools including Informatica
- 2+ years of experience with Unix or Linux systems, including scripting in Shell, Perl, or Python
- Experience with advanced SQL (preferably Teradata)
- Experience working with large data sets and with distributed computing frameworks (MapReduce, Hadoop, Hive, HBase, Pig, Apache Spark, etc.); see the Spark SQL sketch after this list
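
As a hedged illustration of the advanced SQL and distributed-computing experience listed above, here is a small Spark SQL sketch run against a hypothetical Hive table; the schema, table, and column names are invented for the example:

    # Illustrative only: "advanced SQL" via Spark SQL on a hypothetical Hive table.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("example-sql").enableHiveSupport().getOrCreate()

    # Rank each account's transactions by amount using a window function
    result = spark.sql("""
        SELECT account_id,
               txn_id,
               amount,
               RANK() OVER (PARTITION BY account_id ORDER BY amount DESC) AS amount_rank
        FROM curated.daily_transactions
        WHERE txn_date >= '2024-01-01'
    """)

    result.show()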
Desired Qualifications
- MS/MA degree
- Experience with Java and Scala
- Experience with Ab Initio
- Experience with analytic databases, including Hive, Presto, and Impala
- Experience with multiple data modeling concepts, including XML and JSON
- Experience with streaming frameworks, including Kafka, Spark Streaming, Storm, or RabbitMQ
- Experience working with one or more of the following Amazon Web Services (AWS) Cloud services: EC2, EMR, ECS, S3, SNS, SQS, CloudFormation, CloudWatch
- In-depth knowledge of Hadoop and Spark architecture and RDD transformations
- 3+ years of relevant experience developing PySpark programs using its APIs; expertise in file formats such as Parquet and ORC (see the sketch after this list)
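
To make the file-format requirement concrete, a brief PySpark sketch converting a hypothetical Parquet dataset to ORC; the paths and partition column are assumptions for illustration:

    # Illustrative only: reading and writing the columnar formats named above.
    # Paths and column names are hypothetical placeholders.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("example-formats").getOrCreate()

    # Parquet and ORC both embed their schemas, so Spark can load
    # them without an explicit schema definition
    parquet_df = spark.read.parquet("hdfs:///data/lake/curated/daily_transactions")

    # Convert to ORC, partitioned by date for efficient downstream scans
    parquet_df.write.mode("overwrite").partitionBy("txn_date").orc("hdfs:///data/lake/orc/daily_transactions")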
Employment type: Full Time, Permanent