We’re looking for a Hadoop Consultant with Data Ingestion experience! Reach out if you’re interested, and feel free to refer friends or colleagues!
Type of Employment: Contract
Title: Hadoop Consultant with Data Ingestion
Term: 12 months + extension
Location: Remote for now / Toronto
Job ID number: C1231
Brief description of duties:
- Participate in all aspects of Big Data solution delivery life cycle including analysis, design, development, testing, production deployment, and support
- Develop standardized practices for delivering new products and capabilities using Big Data technologies, including data acquisition, transformation, and analysis
- Ensure Big Data practices integrate into overall data architectures and data management principles (e.g. data governance, data security, metadata, data quality)
- Create formal written deliverables and other documentation, and ensure designs, code, and documentation are aligned with enterprise direction, principles, and standards
- Train and mentor teams in the use of the fundamental components in the Hadoop stack
- Assist in the development of comprehensive and strategic business cases used at management and executive levels for funding and scoping decisions on Big Data solutions
- Troubleshoot production issues within the Hadoop environment
- Tune the performance of Hadoop processes and applications
Requirements:
- Proven experience as a Hadoop Developer/Analyst in Business Intelligence
- Strong communication skills, technology awareness, and the ability to work with senior technology leaders are a must
- Good knowledge of Agile Methodology and the Scrum process
- Delivery of high-quality work, on time and with little supervision
- Critical thinking and analytical abilities
- Bachelor's degree in Computer Science, Management Information Systems, or Computer Information Systems is required
- Minimum of 4 years of experience building Java applications
- Minimum of 2 years of experience building and coding applications using Hadoop components – HDFS, Hive, Impala, Sqoop, Flume, Kafka, StreamSets, HBase, etc.
- Minimum of 2 years of experience coding in Scala/Spark, Spark Streaming, Java, Python, and HiveQL
- Minimum of 4 years of experience with traditional ETL tools and data warehousing architecture
- Strong personal leadership and collaborative skills, combined with comprehensive, practical experience and knowledge in end-to-end delivery of Big Data solutions.
- Experience with Exadata and other RDBMSs is a plus
- Must be proficient in SQL/HiveQL
- Hands-on Linux/Unix expertise and scripting skills are required
Nice to haves:
- Strong in-memory database and Apache Hadoop distribution knowledge (e.g. HDFS, MapReduce, Hive, Pig, Flume, Oozie, Spark)
- Past experience using Maven, Git, Jenkins, Se, Ansible, or other continuous integration tools is a plus
- Proficiency with SQL, NoSQL, relational database design and methods
- Deep understanding of techniques used in creating and serving schemas at the time of consumption
- Ability to identify requirements and apply design patterns such as self-documenting data vs. schema-on-read
- Played a leading role in the delivery of multiple end-to-end projects using Hadoop as the data platform.
- Successful track record in solution development and growing technology partnerships
- Ability to clearly communicate complex technical ideas, regardless of the technical capacity of the audience.
- Strong interpersonal and communication skills including written, verbal, and technology illustrations.
- Experience working with multiple clients and projects at a time.
- Knowledge of predictive analytics techniques (e.g. predictive modeling, statistical programming, machine learning, data mining, data visualization).
- Familiarity with different development methodologies (e.g. waterfall, agile, XP, scrum).
- Demonstrated capability with business development in the big data infrastructure business
| Skill | Requirement |
| --- | --- |
| Building Java apps | 4 years |
| Building and coding applications using Hadoop components | 2 years |
| Scala / Spark, Spark Streaming, Java, Python, HiveQL | 2 years |
| SQL/HiveQL | Must have |