Marc Matt
Verified Expert in Engineering
Data Engineer and Developer
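Marc is a data engineer who is passionate about data, with more than 15 years of experience leading teams and building data platforms in the information technology, real estate, and services industries. He created a Python-based AVRO schema generator that makes parts of a schema reusable. Marc excels with automation, integrations, analysis, the building of models, statistics, big data, CI/CD pipelines, and data modeling.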
Preferred Environment
Apache Airflow, Tableau Server, Tableau, SQL, Pandas, Python, Apache Beam, Git, Linux
The most amazing...
...application I've developed provides pose estimation data in real time to help optimize customers' fitness goals.
Work Experience
Senior Data Analyst
Bold Metrics Inc.
- Created a template for ad hoc reporting for all clients.
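- Designed and implemented streaming data ingestion into the data warehouse using Amazon Kinesis, Lambda, and Python.
- Optimized and standardized transformations in the Redshift data warehouse.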
Data Engineer
MediaMarktSaturn Retail Group
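- Built a supply chain monitoring system for distribution centers nationwide.
- Implemented APIs for all logistics service providers and transformed the data into company-wide reporting.
- Built a real-time order tracking system using Apache NiFi on GKE.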
Cloud Data Engineer and Architect
Spin (Tier Mobility) - Main
- Designed and built an MLOps workflow using Google Vertex AI.
- Operationalized ML models for real-time use cases.
- Prepared the migration of the DWH from BigQuery to Snowflake.
- Built an operational support tool for traffic violation incidents.
ETL Engineer
Food Marketing Company
- Parsed JSON data in Talend and loaded it into Redshift.
- Integrated data from web APIs with Talend into Redshift.
- Transformed customer data with Talend and loaded it into Salesforce.
Data Engineer
Janus
- Converted legacy ETL pipelines into scalable AWS Glue jobs.
- Automated resource deployment using AWS CloudFormation.
- Designed and built a framework in PySpark that makes it easier to add pipelines in the future.
Senior Data Engineer
Emma
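- Designed a new data ingestion API for the data platform that supports streaming analytics.
- Set up binlog stream processing and real-time event parsing using Kinesis, Lambda, and Kinesis Data Firehose.
- Optimized data loading in Redshift by analyzing queries and tables in order to add optimized sort and distribution keys.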
Data Specialist
Ear-Reality GmbH
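- Developed a data lake based on Kinesis and Athena, including embedded reports in Metabase.
- Moved production systems to a serverless, scalable architecture.
- Automated load testing of the application using Python and Locust.io.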
Senior Data Engineer
Engel & Völkers
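- Designed and set up a data platform, including tool selection and data modeling.
- Built a TensorFlow model to predict property values in a real-time environment.
- Implemented CI/CD pipelines to automate the deployment of all features of the data platform.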
Head of Data Engineering | Machine Learning
Surf Media
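- Led a team of six and was responsible for their personal development.
- Designed big data systems and data lakes, including tool selection and data modeling.
- Designed the data pipelines and handled model selection for the development of recommendation engines and fraud recognition systems that work in a real-time environment.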
- Created the technology roadmap. Oversaw the advancement of all affected data systems.
Business Intelligence Analyst
Surf Media
- Designed, developed, and operated a DWH for a group of five companies.
- Developed a statistical model for predicting orders.
- Analyzed customers to learn how to optimize revenue in social networks.
Database Consultant
EOS Information Services, GmbH.
- Designed, developed, and operated a DWH for the decision engine in risk management.
- Designed processes for risk management.
- Conceived and developed the address management process using Perl and Uniserv.
Datawarehousing Consultant
Key-Work Consulting, GmbH.
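- Migrated the sales reporting for a mail-order company.
- Developed a statistical model to optimize sales planning for a mail-order company.
- Built a statistical model for dynamic shipping planning.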
Database Management
Coxulto Marketing Solutions, GmbH.
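- Defined and selected target groups for marketing campaigns.
- Completed an affinity analysis of the entire customer base.
- Managed and operated the address database, including duplicate elimination.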
Lead of Business Intelligence Consumer Products
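1&1 Internet AG
- Coordinated and prioritized all tasks of the business intelligence team.
- Designed and developed KPI reports for the board of directors.
- Analyzed the customer structure and built a customer churn prediction model.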
Business Intelligence Analyst
1&1 Internet AG
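- Designed and developed an automated reporting system for customer and contract inventory, as well as internet usage and customer behavior.
- Integrated customer usage data from the company website into the DWH.
- Coordinated all tasks between the management and development departments.
- Analyzed the effectiveness of all campaigns for new and existing customers.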
Experience
AVRO Schema Generator
http://gitlab.com/datascientists.info/avro-generator
If certain data structures are used in several schemas, this tool makes it possible to define those structures only once and then reuse them across multiple schemas.
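The idea of defining a structure once and reusing it across schemas can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the tool's actual implementation: the fragment registry `FRAGMENTS`, the helpers `resolve` and `build_schema`, and the `Address`, `Customer`, and `Order` records are all hypothetical names. The sketch relies on the Avro rule that a named type is defined in full once per schema and referenced by name afterwards.

```python
import copy
import json

# Hypothetical reusable fragment, defined exactly once.
ADDRESS = {
    "type": "record",
    "name": "Address",
    "fields": [
        {"name": "street", "type": "string"},
        {"name": "city", "type": "string"},
    ],
}

# Hypothetical registry mapping fragment names to their definitions.
FRAGMENTS = {"Address": ADDRESS}

# Only these keys can hold type references; "name" values are left alone.
TYPE_KEYS = ("type", "fields", "items", "values")


def resolve(node, seen):
    """Inline a fragment on first use; keep later uses as name references."""
    if isinstance(node, str):
        if node in FRAGMENTS and node not in seen:
            seen.add(node)
            return resolve(copy.deepcopy(FRAGMENTS[node]), seen)
        return node
    if isinstance(node, list):
        return [resolve(item, seen) for item in node]
    if isinstance(node, dict):
        return {
            key: resolve(value, seen) if key in TYPE_KEYS else value
            for key, value in node.items()
        }
    return node


def build_schema(name, fields):
    """Build a record schema, expanding shared fragments where needed."""
    schema = {"type": "record", "name": name, "fields": fields}
    return resolve(schema, set())


customer = build_schema("Customer", [
    {"name": "id", "type": "long"},
    {"name": "billing", "type": "Address"},
    {"name": "shipping", "type": "Address"},
])
order = build_schema("Order", [
    {"name": "destination", "type": "Address"},
])

print(json.dumps(customer, indent=2))
```

In this sketch, both `Customer` and `Order` pick up the same `Address` definition, so a change to the fragment reaches every schema that uses it. Within `Customer`, the second use stays a plain name reference because the type has already been defined earlier in that schema.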
Evaluation of Property Value
Design and Set-up of Data Platform
Skills
Languages
Python, SQL, Perl, Java, XML, Snowflake, Python 3, TypeScript
Tools
BigQuery, Apache HAWQ, Apache Avro, Git, Apache Beam, Tableau, Apache Airflow, Jenkins, Apache NiFi, RabbitMQ, Microsoft Excel, Terraform, Amazon Elastic MapReduce (EMR), Amazon EKS, AWS IAM, Google Kubernetes Engine (GKE), Talend ETL, Amazon Athena, AWS CloudFormation, Redshift Spectrum, Matillion ETL for Redshift, AWS Fargate, AWS Glue
Paradigms
ETL, Business Intelligence (BI), Data Science, DevOps, Microservices
Platforms
Amazon Web Services (AWS), Linux, Docker, Talend, Hortonworks Data Platform (HDP), Oracle, AWS Lambda, Google Cloud Platform (GCP), Kubernetes, AWS Elastic Beanstalk
Storage
MySQL, Google Cloud, Database Modeling, Redshift, Databases, Database Architecture, SQL Server 2010, Data Pipelines, Amazon S3 (AWS S3), PostgreSQL, Google Cloud SQL, Data Lakes, Apache Hive, HDFS, NoSQL, Amazon Aurora, JSON
Other
Data Visualization, Data Analysis, Data Architecture, Data Engineering, Data Warehousing, Data Modeling, Data Warehouse Design, Data Reporting, Database Schema Design, Data Management, Google Cloud Functions, Cloud Run, APIs, Data Wrangling, ETL Tools, Tableau Server, Google BigQuery, Data Profiling, Google Data Studio, Fivetran, Serverless, Scaling, Dashboards, Amazon Kinesis, Parquet, Cloud Architecture, Big Data, Architecture, Big Data Architecture, Machine Learning Operations (MLOps), CI/CD Pipelines, Cloud Security, Kubeflow, Data Build Tool (dbt), Cloud Tasks, Azure Databricks
Frameworks
Spark, Apache Spark, Flask, Django, Hadoop, Serverless Framework
Libraries/APIs
Pandas, PySpark, TensorFlow, REST APIs, Node.js