Big Data/Hadoop Specialist
- Plano, Texas, United States
- 1 year ago
Requisition Id : 9866360
Responsible for designing, developing, modifying, debugging and/or maintaining software systems. Serves as an expert on specific modules, applications or technologies, and handles complex assignments during the software development life cycle.
What will your job look like?
• You will take ownership of and accountability for specific modules within an application, and provide technical support and guidance during solution design for new requirements and problem resolution for critical/complex issues
• You will ensure code is maintainable, scalable and supportable.
• You will present demos of the software products to stakeholders and internal/external customers, using knowledge of the product/solution and technologies to influence the direction and evolution of the product/solution.
• You will investigate issues by reviewing/debugging code and providing fixes (analyzing and fixing bugs) and workarounds, review changes for operability to maintain existing software solutions, and highlight and help mitigate risks from a technical perspective.
• You will bring continuous improvements/efficiencies to the software or business processes by utilizing software engineering tools and various innovative techniques, and by reusing existing solutions. Through automation, you will reduce design complexity, reduce time to response, and simplify the client/end-user experience.
• You will represent/lead discussions related to the product/application/modules/team (for example, leading technical design reviews), and build relationships with internal customers/stakeholders
• Bachelor's or Master's degree in Science/IT/Computing/Business Analytics or equivalent.
• 5+ years' total development experience, mainly around Big Data: Hive & Hadoop, Spark, Scala, Python/PySpark, AWS, and other cloud-related technologies.
• Experienced in Data Engineering, with a good understanding of data warehouses, data lakes, data modelling, parsing, data wrangling, cleansing & transformation, and sanitizing.
• Agile work experience; experience building CI/CD pipelines using Jenkins
• Hands-on development experience with Scala and Python using Spark 2.0, Spark internals, and Spark job performance tuning
• Good understanding of Yarn, Spark UI, Spark resource management and Hadoop resource management and efficient Hadoop storage mechanisms.
• Good understanding of, and experience with, performance tuning in cloud environments for complex software projects, mainly around large scale and low latency.
• AWS knowledge is essential, with good working experience in AWS technologies: EMR, S3, cluster management, and AWS Airflow automation.
• Snowflake knowledge is a plus.
• AWS development certification/Spark certification is an advantage