Logically Speaking November 2022: All About Data
November 28, 2022
 
Growth Opportunities
 
Keys to Selling
 

Data This; Data That: The Most Popular Data Jobs and the Skills They Require

by Jeff Felice, President, CertNexus

 
 

A decade ago, it was rather easy to understand your relationship to data. In most cases, you were either a database administrator, business analyst, or consumer of data. As a database administrator you “owned” the system and how the data was structured and stored. As a business analyst, you were the go-to “data person” who could access the data and manipulate it into reports and graphs. As the data consumer, you received the reports to use in your decision making.

Although some organizations still operate this way, most have migrated to more complex environments: they use cloud infrastructure, draw on multiple sources of data (some of which may be unstructured), leverage tools like AI for automation and predictive capabilities, and employ business intelligence platforms. These advances in technology, and the complexity of how we collect, store, engineer, and use our data, have resulted in an explosion of data-related roles. In today’s organization, you will see more modern titles such as Data Analyst, Data Engineer, and Data Scientist, along with other data-affiliated titles such as Machine Learning (ML) Engineer.

Now that we have labeled some of the job titles, let us look at each of the roles to define what types of tasks they do and the skills they require.

  • Data Engineers manage data throughout its lifecycle, including designing, building, and maintaining data infrastructure, whether on premises, in the cloud, or both. Key skills for Data Engineers are administering databases and other data repositories and transforming or moving data between these platforms using technologies such as Hadoop and Spark. Other skills include querying languages like SQL; programming languages like Python, C, or Java; and other ETL and database tools.
  • Data Analysts obtain appropriate data, prepare it for analysis, and create reports and visualizations of data. Key skills for Data Analysts include visualization tools such as Microsoft® Excel®, Microsoft® Power BI®, and Tableau®; querying languages like SQL; programming in Python® or R; and for those with more advanced skills, potentially using tools such as Jupyter Notebook or Posit. Data Analysts will also often be required to explain the data and its implications to others, so they will need to be capable presenters as well.
  • Data Scientists perform deeper analysis on the data, including developing predictive models to solve more complex data problems. Key skills for Data Scientists include a strong mathematical foundation in the areas of probability, statistics, linear algebra, and calculus; familiarity with tools such as Jupyter Notebook and Posit; programming languages such as Python or R; and modeling with pandas, PyTorch, and similar environments. Data Scientists typically will also possess some domain expertise.
  • Machine Learning Engineers create algorithms and models that use data to learn and generate predictive outcomes. Key skills for Machine Learning Engineers include programming skills in Python or R; mathematical foundations in linear algebra, probability, statistics, and multivariate calculus; knowledge of ML algorithms; familiarity with tools such as Jupyter Notebook and Posit; and data modeling.

Those of us who have been in the industry for a longer period can see that the modern data roles are still built on specific skills and tools. The difference is that now we need a broader set of skills that includes both vendor-agnostic and vendor-specific tools, regardless of role. In addition, we need to refresh our knowledge and skills at a more rapid pace than ever before. But for many of us, that is exactly why we chose a profession in technology – to be challenged throughout our careers.


Curriculum Corner
 

All Aboard the Data Science Train

by Jason Nufryk, Instructional Designer

 
 

Data scientists are in demand, but employers have been struggling to find qualified personnel who can extract real value from the organization’s data. This is partly because data science has many dimensions and a staggering number of tools and technologies, so people who want to pivot to a data science career, or to improve their data science skill set, often don’t know where to start. Nevertheless, there is a clear path forward.

Think of your data science journey as a cross-country train trip. The train stops at various stations on the line; you board at your starting point and continue until you reach your destination. On this data science journey, each station is a chance to learn some specific skill or technology in the field, with each stop bringing you closer to your destination as a data science practitioner. 

If you’re a complete newcomer to data science, with skills in basic computing and productivity apps like the Microsoft® Office suite, you’ll board near the start of the line. Spreadsheet Station offers fundamental data entry, manipulation, and analysis skills using Microsoft® Excel®. Most data scientists will want mastery of this important business app before continuing on the train. 

If your Excel mechanics are already solid, get on at Data Analysis Depot. Here, you’ll harness the true power of Excel for data analysis, wrangling, and visualization. You’ll use PivotTables to manipulate and reshape data to reveal insights, create dashboards with graphs and charts to make identifying key information a breeze, enhance your analysis capabilities with Power Pivot, and write Visual Basic for Applications (VBA) code to automate repetitive data transformation tasks. 

Excel data analysts might board the data science train at Visualization Station. On this part of the ride, you’ll use tools like Tableau® and Microsoft® Power BI® that are purpose built for data analysis and visualization. Each tool provides a complete platform for collecting, modifying, and analyzing data in powerful ways. 

For many data analysts, that might be the last stop. But the data science train ride doesn’t end there. Keep going through Programming Junction to unlock the true potential of data science with object-oriented programming. Python® is your best choice, because it is the dominant language in the field of data science. But you can also change here for the R programming language, another popular option in the field.

 

Python programmers might join the train as it passes through the plains of Python data science libraries—NumPy, pandas, Matplotlib, and more. These tools take you above and beyond the capabilities of GUI applications like Excel and Tableau. With Python libraries, you can automate data collection, data preparation, and data visualization; integrate into existing software environments; and deal with disparate data sources and data formats. And, leveraging data science programming libraries makes it easier to scale up your efforts in big data environments. 
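To make the idea of automating data preparation a little more concrete, here is a minimal sketch using pandas. The sales figures, the column names, and the summarize_sales helper are all invented for illustration; they are assumptions, not part of any course material. The sketch tidies inconsistent region labels, drops a row with a missing value, and reshapes the result much like an Excel PivotTable would.

```python
import io

import pandas as pd

# Hypothetical raw export: note the missing sales value and the
# inconsistently capitalized "north" label.
RAW_CSV = """region,quarter,sales
North,Q1,100
North,Q2,
South,Q1,80
South,Q2,120
north,Q2,90
"""


def summarize_sales(csv_text: str) -> pd.DataFrame:
    """Clean the raw rows and total sales by region and quarter."""
    df = pd.read_csv(io.StringIO(csv_text))
    # Normalize region labels so "north" and "North" are treated alike.
    df["region"] = df["region"].str.strip().str.title()
    # Drop rows where the sales figure is missing.
    df = df.dropna(subset=["sales"])
    # Reshape: one row per region, one column per quarter.
    return df.pivot_table(
        index="region",
        columns="quarter",
        values="sales",
        aggfunc="sum",
        fill_value=0,
    )


summary = summarize_sales(RAW_CSV)
print(summary)
```

Because the cleaning steps live in a function, the same preparation can be rerun on next month’s export without repeating any manual spreadsheet work.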

Before you reach the final station, you might ask the conductor to stop the train at some smaller stations where you can enhance your personal portfolio of knowledge and skills. For instance, you might make a whistle stop at SQL Station; querying relational databases with Structured Query Language is important since much of the data you’ll need to extract comes from these databases. Another good request stop is immersing yourself in the ethical challenges that data science and other emerging technologies pose. That way you can ensure your work minimizes harm and upholds principles like equity and privacy. 
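As a rough illustration of what a stop at SQL Station might involve, the sketch below queries a small relational database from Python using the standard-library sqlite3 module. The customers and orders tables, their contents, and the top_customers helper are hypothetical, made up only to show a join with an aggregate and a filter.

```python
import sqlite3


def top_customers(conn: sqlite3.Connection, min_total: float) -> list:
    """Return (name, total) pairs for customers whose orders sum to at least min_total."""
    query = """
        SELECT c.name, SUM(o.amount) AS total
        FROM customers AS c
        JOIN orders AS o ON o.customer_id = c.id
        GROUP BY c.id, c.name
        HAVING SUM(o.amount) >= ?
        ORDER BY total DESC
    """
    # The "?" placeholder passes min_total as a parameter rather than
    # pasting it into the SQL text.
    return conn.execute(query, (min_total,)).fetchall()


# Build a throwaway in-memory database with invented sample data.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(id),
        amount REAL NOT NULL
    );
    INSERT INTO customers (id, name) VALUES (1, 'Acme'), (2, 'Globex'), (3, 'Initech');
    INSERT INTO orders (customer_id, amount)
        VALUES (1, 250.0), (1, 100.0), (2, 75.0), (3, 400.0);
    """
)

rows = top_customers(conn, 100)
print(rows)
```

The same SELECT, JOIN, GROUP BY, and HAVING pattern carries over to most relational databases, although connection details and some syntax vary by vendor.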

At the end of the line, you’ll be ready to combine all the skills and knowledge from all your stops—data collection, cleaning, processing, visualization, analysis—and apply those to an overall business project that follows repeatable tasks in a lifecycle. And, you’re finally ready to tackle artificial intelligence (AI), machine learning, and deep learning through Python libraries like scikit-learn, TensorFlow, and PyTorch. With these libraries and machine learning theory, you can realize business value through predictive modeling, product recommendation, sales forecasting, computer vision, natural language processing (NLP), and so much more.  

So, check out our data science learning path and get on board the data science train! We’ll help you hop on and hop off at the right stops as you head toward your final destination: data science expertise. 


________________________________________

 

Latest Product Highlights

 

Reminder: The CertNexus CyberSAFE: Exam CBS-410 eLearning is available!

It includes everything a student needs to be CyberSAFE: eLearning that builds awareness of technology-related risks, plus the assessment and credential. Shop the CyberSAFE: Exam CBS-410 eLearning now.

 
   
 
 

________________________________________

 

Content Revisions

 

Logical Operations revises student and instructor materials based on technical changes, customer feedback, and our own assessment of necessary changes. The revision notes for the most recent updates are below as well as posted on the Content Revisions page. Use this page as a resource to quickly access and view all revision details for any of our recent course updates. 

Reminder: When viewing a product on the store, check the Revision Information tab to see the summary description of the most recent revision for that product at any time.
 

[Screenshot of revision field on Logical Operations store]

 

Client Services Corner