Marius B.

Data Engineer

390 dollar
Freelancer
2 years
Cluj-Napoca, ROMANIA

My experience

Yardi, August 2019 - Present

- Aggregate large datasets from multiple sources.
- Identify patterns in complex datasets, then analyze and interpret them to extract relevant information.
- Research missing/incomplete/faulty data.
- Assess, ensure and maintain data quality.
- Create, use, and maintain Perl and SQL scripts for ETL transactions, data parsing, and task automation.
- Track errors and provide ongoing inter-departmental communication, along with daily or monthly data reports.
Stack: Perl, Oracle SQL, Git, SVN, Linux, RegEx, Jenkins.

DXC Luxoft, May 2021 - Present

- Implement and develop an auto-orchestrated Data Ingestion Framework (internal framework).

- Leverage Azure Cloud to create and maintain ETL/ELT environments.

- Develop new features within the framework or as part of the native data-flow leading to it.

- Act as SME for the initial deployment and start-up of the framework in new clients' environments.


• Stack: Python, PySpark, Databricks, Azure DevOps, Azure Data Factory, SQL, Azure SQL, Delta Lake.

Steelcase, January 2021 - Present

• Work in an Agile team to support multiple stakeholders and meet their data needs for:

-- ML/AI projects for Data Science.

-- Visualization projects for Data Analytics.

• Leverage Azure Data Factory, Azure Databricks, Azure DevOps, Azure SQL, Snowflake, and related tools to create and maintain ETL/ELT data pipelines with multiple ingestion points and transformations across varied data formats.

• Lead new features and high-impact projects that coordinate resources from multiple sources and teams in a decentralized department.

• Leverage and improve the Databricks Spark environment for efficient data manipulation and transformation.

• Drive improvements to the current project architecture:

 -- Use the new Repos feature in Databricks to improve the current CI/CD.

 -- Layer the project structure to move away from heavy notebook scripting and toward modular code.

 -- Migrate toward more functional and OOP-structured code for better QA.


• Stack: Python, PySpark, Databricks, Azure DevOps, Azure Data Factory, SQL, Java, Snowflake, Oracle SQL, Azure SQL, Delta Lake.

My stack

Databases

MySQL, Oracle, PostgreSQL, Microsoft SQL Server

Big Data

PySpark

Other

Research Analyst, Big Data Development, Senior Data Engineer, Line Coordinator, Microsoft Office, Data Analyst, Oracle SQL, Tableau, Scrum Methodology, Python Programming, Regular Expressions, Error Handling, Spring Framework, Cascading Style Sheets, Data Collection, LinkedIn, Real Estate Market Analyst, Bachelor's Degree, Windows Azure Platform, Apache Subversion, RDBMS, FOCUS, Perl Programming, Azure SQL, Object Oriented Analysis/Design, Snowflake, Data Engineer

Frameworks

REST, JPA, Spring

Middleware

Jenkins

Analysis methods and tools

Kanban, Apache Maven, Design Patterns, DevOps, JIRA, Confluence, Subversion (svn)

Environment of Development

GitLab, Maven

Others

Project Management, Leadership, Communication, Analytics, GitHub, Data analysis, Artificial Intelligence, GitFlow

Computer Tools

Microsoft Excel, MS Office

IT Infrastructure

Linux, Azure Cloud, Azure DevOps, Git, Docker

Technologies

Azure Data Factory

Business Intelligence

Business Intelligence, ETL, Tableau Software

Languages

Java, Regex, MVC, Oracle PL/SQL, HTML, OOP, Python, Bash, Perl, JavaScript, SQL