Mohit Sharma

Software Developer, +91-9557117565 · 19mohitsharma95@gmail.com

Programmer Analyst at American Express with 3+ years of experience specializing in Spark and Hadoop. Knowledgeable in Python, Scala, and machine learning.

Experience

Programmer Analyst

American Express

Member of the internal product owner team for the ML model deployment platform. Responsibilities include:

  • Deploying machine learning models by converting Python/Hive code to Spark
  • Creating Spark applications for data distribution and model performance monitoring
  • Distributing the prediction step of Keras-based deep learning models using Pandas UDFs
  • Creating custom UDFs and JARs for distributed data pre-processing use cases
  • Designing and managing internal-facing products

July 2019 - Present

Systems Engineer

Infosys

Worked as a Data Engineer for client Apple Inc. on the Anti-Fraud team, setting up input data pipelines and building Spark-based applications that supply pre-processed data to the data science team for their fraud detection models.

July 2017 - July 2019

Education

SRM University

Bachelor of Technology
Computer Science and Engineering

GPA: 9.23

August 2013 - May 2017

Su-Bodh Public School

High School

Percentage: 71.4%

April 2008 - March 2012

Skills

Big Data Ecosystem
Programming Languages
Python Libraries
DBMS & Other Tools

Projects

Fraud Detection using GraphX

Client: Apple Inc.

Developed a Spark application using Spark GraphX to support fraud detection for Apple. The application provides fraudulent-user data for training the machine learning fraud detection engine.

  • Implemented unsupervised K-means clustering and fuzzy logic to enhance GraphX results.
  • Optimized the application performance by reducing data shuffle and optimizing joins.
  • Utilized Pandas, Matplotlib and NetworkX libraries for visual representation of data and graphs.
  • Automated data validation using Python.

July 2018 - March 2019

ARES Semantic Process

Client: Apple Inc.

Spark-based semantic process for extracting key-value pairs from JSON data, driven by a dynamic key configuration that users set in the UI.

  • Created Hive tables and scripts to maintain data in HDFS.
  • Validated data between the Hive source and the target Teradata database.
  • Automated validation of duplicate JSON keys using Python.
  • Implemented a Python script to periodically drop Hive table partitions.

Feb 2018 - July 2018

Awards & Certifications

  • Microsoft MTA for Programming in Python
  • Microsoft MTA for Software Development Fundamentals
  • SoloLearn Certification in Python
  • SoloLearn Certification in SQL
  • SoloLearn Certification in C#
  • HackerRank 1 Silver and 8 Bronze Medals in Competitive Programming
  • HackerEarth Rating in top 10% worldwide with 200+ problems solved