What is data science engineering?
Data Science is a multidisciplinary confluence of Computer Science, Computational Mathematics, Statistics, and Management. Data engineering is the part of this field concerned with data collection methods and with designing enterprise systems for storing and retrieving data.
Data engineering is the aspect of data science that focuses on practical applications of data collection and analysis. For all the work that data scientists do to answer questions using large sets of information, there have to be mechanisms for collecting and validating that information. In order for that work to ultimately have any value, there also have to be mechanisms for applying it to real-world operations in some way. Those are both engineering tasks: the application of science to practical, functioning systems.
Data engineers focus on the applications and harvesting of big data. Their role doesn’t include a great deal of analysis or experimental design. Instead, they are out where the rubber meets the road (literally, in the case of self-driving vehicles), creating interfaces and mechanisms for the flow and access of information. They may be experts in:
1. System architecture
2. Programming
3. Database design and configuration
4. Interface and sensor configuration
1. Data Analysts:
Data Analysts need a baseline understanding of some core skills: statistics, data munging, data visualization, and exploratory data analysis.
Tools: Microsoft Excel, SPSS, SPSS Modeler, SAS, SAS Miner, SQL, Microsoft Access, Tableau, SSAS.
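To make "data munging" and exploratory analysis a little more concrete, here is a minimal sketch in Python with pandas. The table, column names, and values are made up for illustration; a real analyst would be working against an export from one of the tools listed above.

```python
# A minimal, hypothetical sketch of data munging and exploratory data
# analysis with pandas; the column names and values are invented.
import pandas as pd

# Raw records as they might arrive from an export: strings, missing values.
raw = pd.DataFrame({
    "region": ["North", "South", "North", None, "South"],
    "revenue": ["1200", "950", None, "1100", "870"],
})

# Data munging: drop incomplete rows and coerce revenue to a numeric type.
clean = raw.dropna().copy()
clean["revenue"] = pd.to_numeric(clean["revenue"])

# Exploratory analysis: summary statistics and a simple group-by.
print(clean["revenue"].describe())
print(clean.groupby("region")["revenue"].mean())
```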
2. Business Intelligence Developers:
Business Intelligence Developers are data experts who work closely with internal stakeholders to understand reporting needs, and who then collect requirements and design and build BI and reporting solutions for the company. They design, develop, and support new and existing data warehouses, ETL packages, cubes, dashboards, and analytical reports.
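As a hedged illustration of the reporting side of this role, the sketch below shows the kind of aggregation query that might sit behind a dashboard or analytical report. Python's built-in sqlite3 stands in for a real data warehouse, and the table and column names are hypothetical.

```python
# Hypothetical sketch of a reporting query a BI developer might expose
# in a dashboard; sqlite3 stands in for a real data warehouse.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, quarter TEXT, revenue REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [("North", "Q1", 1200.0), ("South", "Q1", 950.0),
     ("North", "Q2", 1100.0), ("South", "Q2", 870.0)],
)

# A typical report: total revenue per region per quarter, ready for a chart.
report = conn.execute(
    "SELECT region, quarter, SUM(revenue) FROM sales "
    "GROUP BY region, quarter ORDER BY region, quarter"
).fetchall()

for region, quarter, total in report:
    print(region, quarter, total)
```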
What do data science engineers do?
Data engineers build and maintain the systems that allow data scientists to access and interpret data. The role generally involves creating data models, building data pipelines, and overseeing ETL (extract, transform, load). Data scientists build and train predictive models using data after it's been cleaned.
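As a rough sketch of the extract, transform, load pattern mentioned above, the example below moves records from a CSV extract into a SQLite table. The file name, column names, and target schema are all assumptions made for the example, not a prescription for any particular system.

```python
# A minimal ETL sketch: extract rows from a CSV file, transform them,
# and load them into a SQLite table. Paths and columns are hypothetical.
import csv
import sqlite3

def extract(path):
    """Read raw rows from a CSV export."""
    with open(path, newline="") as f:
        return list(csv.DictReader(f))

def transform(rows):
    """Normalise types and drop rows missing a revenue figure."""
    cleaned = []
    for row in rows:
        if not row.get("revenue"):
            continue
        cleaned.append((row["region"].strip(), float(row["revenue"])))
    return cleaned

def load(records, db_path="warehouse.db"):
    """Append the transformed records to the target table."""
    conn = sqlite3.connect(db_path)
    conn.execute("CREATE TABLE IF NOT EXISTS sales (region TEXT, revenue REAL)")
    conn.executemany("INSERT INTO sales VALUES (?, ?)", records)
    conn.commit()
    conn.close()

if __name__ == "__main__":
    load(transform(extract("sales_extract.csv")))  # hypothetical file name
```

In a production pipeline each of these steps would typically be scheduled, monitored, and made restartable, which is where the data engineer's systems focus comes in.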