Marc P.

Data Scientist

690 dollar
Freelancer
1 year
Paris, FRANCE

My experience

DataScientest.com, February 2019 - May 2019

Built a movie recommendation engine with Python and Spark. The algorithm is based on 20 million ratings given by 30,000 users to 100,000 movies, together with descriptive data on the movies and users.

Main steps implemented:
- Selection, cleaning, and creation of new variables
- Descriptive analysis of the variables
- K-Means clustering of users and movies
- Linear, Ridge, and Lasso regressions
- Collaborative filtering recommendation algorithms: alternating least squares, stochastic gradient descent
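
As an illustration of the stochastic gradient descent variant of collaborative filtering listed above, here is a minimal pure-Python sketch of matrix factorization trained on a toy rating list. It is not the project's Spark code: the function names (`train_sgd_mf`, `predict`, `rmse`), the hyperparameters, and the toy ratings are all assumptions made for illustration.

```python
import random


def train_sgd_mf(ratings, n_users, n_items, n_factors=2,
                 lr=0.02, reg=0.02, n_epochs=300, seed=0):
    """Factorize a sparse rating list into user factors P and item factors Q.

    ratings: list of (user_index, item_index, rating) tuples.
    All hyperparameter defaults are illustrative assumptions.
    """
    rng = random.Random(seed)
    # Small random initial factors.
    P = [[rng.gauss(0, 0.1) for _ in range(n_factors)] for _ in range(n_users)]
    Q = [[rng.gauss(0, 0.1) for _ in range(n_factors)] for _ in range(n_items)]
    samples = list(ratings)
    for _ in range(n_epochs):
        rng.shuffle(samples)  # visit the ratings in random order each epoch
        for u, i, r in samples:
            err = r - sum(pu * qi for pu, qi in zip(P[u], Q[i]))
            for f in range(n_factors):
                pu, qi = P[u][f], Q[i][f]
                # Gradient step on the squared error with L2 regularization.
                P[u][f] += lr * (err * qi - reg * pu)
                Q[i][f] += lr * (err * pu - reg * qi)
    return P, Q


def predict(P, Q, u, i):
    """Predicted rating: dot product of user and item factors."""
    return sum(pu * qi for pu, qi in zip(P[u], Q[i]))


def rmse(ratings, P, Q):
    """Root mean squared error over a rating list."""
    total = sum((r - predict(P, Q, u, i)) ** 2 for u, i, r in ratings)
    return (total / len(ratings)) ** 0.5


# Toy data (hypothetical): 3 users, 3 movies, 6 observed ratings.
toy_ratings = [(0, 0, 5.0), (0, 1, 4.0), (1, 0, 4.0),
               (1, 2, 1.0), (2, 1, 5.0), (2, 2, 2.0)]
P, Q = train_sgd_mf(toy_ratings, n_users=3, n_items=3)
print("training RMSE:", rmse(toy_ratings, P, Q))
print("predicted rating for user 0, movie 2:", predict(P, Q, 0, 2))
```

At the scale described here, a distributed implementation such as the alternating least squares estimator available in Spark would be the more realistic choice; the sketch only shows the update rule.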

Bouygues Telecom, March 2018 - July 2018

Analysis of the fiber rollout

Main steps:
- Enrichment and analysis of a database covering a fleet of 30,000 pieces of equipment and 50 variables, using Excel, VBA, and Microsoft Access.
- Production of a consistency report.
- Production of performance indicators.

EDF, April 2017 - September 2017

Design of an energy-efficiency tool to identify business customers with excessive energy consumption.

Method implemented: statistical analysis of electricity consumption time series, modeling of load curves, and creation of a web graphical interface (R Shiny) to share the results.


Mission: creation of an energy-efficiency tool for business customers. The tool estimates each user's theoretical consumption and compares it with actual consumption to identify overconsumption.
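
The comparison of theoretical and actual consumption can be sketched as follows. This is a hypothetical Python illustration, not the original R implementation: the function name `flag_overconsumption`, the 20% tolerance, and the sample figures are assumptions, and the theoretical values are taken as given rather than modeled.

```python
def flag_overconsumption(actual_kwh, theoretical_kwh, tolerance=0.2):
    """Return the clients whose actual consumption exceeds the
    theoretical consumption by more than `tolerance` (a fraction).

    Both arguments map a client identifier to a consumption in kWh.
    The 20% default tolerance is an illustrative assumption.
    """
    flagged = []
    for client, actual in actual_kwh.items():
        expected = theoretical_kwh[client]
        if actual > expected * (1 + tolerance):
            flagged.append(client)
    return flagged


# Hypothetical figures for three clients.
actual = {"client_a": 110.0, "client_b": 100.0, "client_c": 150.0}
theoretical = {"client_a": 100.0, "client_b": 100.0, "client_c": 100.0}
print(flag_overconsumption(actual, theoretical))  # ['client_c']
```

A tolerance band avoids flagging clients whose consumption is only marginally above the model estimate.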

Tools: R and VBA.
Database of 500 clients, with about 20 variables per client (area, number of employees, geographical area, ...) and load curves of about 3,000 data points.

In VBA
- Selection of useful data
- Grouping of data
- Cleaning of the data

In R
- Normalization of data
- Classification with machine learning
- Proposal of models
- Testing of models
- Programming of a web graphical interface (R Shiny) to share the results

Texas A&M Institute for Preclinical Studies (TIPS), September 2015 - December 2015

Digital design of a research unit (NITRA SOLID) in SolidWorks.
Mechanical and thermal simulations to determine the behavior of the device according to various constraints.

My stack

Computer Tools

MS Office, Microsoft PowerPoint, Microsoft Excel

Application servers

Apache Web Server

Databases

MongoDB

Others

Teamwork

CAD&CAM

SolidWorks, Catia

Technologies

MapReduce, NumPy, Machine Learning, Amazon Web Services (AWS)

Embedded and Telecom

Simulink

Languages

R Language, C/C++, Python 3.5, Python, VBA, SQL

Big Data

Oozie, Big Data, Hadoop, Spark

Other

Data Scientist, OCR, Visual Basic for Applications, IBM Hardware, President, Master 2, Engineering degree (Diplôme d'ingénieur), Ascential DataStage, Apache Flume, Apache Cassandra, Apache Hive, Consultant Data Scientist, Sqoop, Python Programming, Physics-Chemistry, Junior Statistical Engineer (Corsica study project), Regional Office Secretary, Junior Research Assistant, English, German, Microsoft Access

My education and training

Master 2 (M2) - Energy Markets and Law - Université de Montpellier, 2016 - 2017

Diplôme d'ingénieur (engineering degree) - Engineering - EPF Ecole d'Ingénieurs, 2014 - 2017

Second year - Physics-Chemistry - Condorcet, 2011 - 2014