Getting Started

Mastery of this intuitive statistical concept will advance your credibility as a decision maker.

Photo by Ella Olsson from Pexels

Bayes' Theorem gives us a way of updating our beliefs in light of new evidence, taking into account the strength of our prior beliefs. Deploying Bayes' Theorem, you seek to answer the question: how probable is my hypothesis in light of new evidence?

  1. Updating
  2. Communicating
  3. Classifying

By the end, you’ll possess a deep understanding of the foundational concept.

#1 — Updating

Bayes' Theorem provides a structure for testing a hypothesis, taking into account the strength of prior assumptions and the new evidence. …
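To make the idea of updating concrete, here is a minimal sketch of a single Bayesian update in Python. The function name `bayes_update` and every number below are hypothetical, chosen only for illustration; they do not come from the article.

```python
def bayes_update(prior, p_evidence_given_h, p_evidence_given_not_h):
    """Return P(H | E), the posterior probability of hypothesis H given evidence E."""
    # Numerator of Bayes' Theorem: P(E | H) * P(H)
    numerator = p_evidence_given_h * prior
    # Total probability of the evidence, whether H is true or false
    evidence = numerator + p_evidence_given_not_h * (1.0 - prior)
    return numerator / evidence


# Hypothetical inputs: a weak 1% prior, and evidence that shows up
# 90% of the time when H is true but 5% of the time when H is false.
posterior = bayes_update(prior=0.01,
                         p_evidence_given_h=0.90,
                         p_evidence_given_not_h=0.05)

# A second, independent piece of the same kind of evidence:
# yesterday's posterior becomes today's prior.
posterior_after_second = bayes_update(prior=posterior,
                                      p_evidence_given_h=0.90,
                                      p_evidence_given_not_h=0.05)
```

With these assumed inputs, one piece of evidence moves the belief from 1% to roughly 15%, and a second moves it to roughly 77%. A weak prior needs repeated evidence before the hypothesis becomes likely.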


Even in the aftermath of the replication crisis, statistical significance lingers as an important concept for Data Scientists to understand.

Photo by Pixabay from Pexels

There are many types of statistical tests; null hypothesis significance testing predominates.

Overview

The goal of the researcher conducting the null hypothesis test is to evaluate whether or not…


Drop in for some tips on how this fundamental statistics concept can improve your data science.

Photo by Cameron Casey from Pexels

The distribution of data refers to the way the data is spread out. In this article, we’ll discuss the essential concepts related to the normal distribution:

  • Ways to measure normality
  • Methods to transform a dataset to fit the normal distribution
  • Use of the normal distribution to represent naturally occurring phenomena and offer statistical insights
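As a rough illustration of the first two bullets, the sketch below measures one symptom of non-normality (skewness) and then applies a log transform. It uses only the Python standard library. The `skewness` helper and the simulated log-normal sample are assumptions made for this example, not the article's own method, and a log transform is only one of several possible transformations.

```python
import math
import random
import statistics


def skewness(values):
    """Population skewness: close to 0 for symmetric data such as a normal sample."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum(((v - mean) / sd) ** 3 for v in values) / len(values)


random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical right-skewed sample: draws from a log-normal distribution
raw = [random.lognormvariate(0.0, 1.0) for _ in range(5000)]

# The log of a log-normal variable is normally distributed
logged = [math.log(v) for v in raw]

skew_raw = skewness(raw)        # strongly positive: long right tail
skew_logged = skewness(logged)  # near zero after the transform
```

Skewness alone does not establish normality; in practice it is usually paired with a visual check such as a histogram or Q-Q plot, or a formal test.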

Overview

Data distribution is of great importance in statistics because we are almost always sampling from a population where the full distribution is unknown. The distribution of our sample may limit the statistical techniques available to us.


The tools you need to succeed with machine learning in the new year.

Photo by Ian Schneider on Unsplash

Web Resources

🔦 ML Showcase — great for project inspiration, this repository of data science projects from Team Paperspace should certainly get your wheels turning.


The sixth tool is coffee.

Photo by Chevanon Photography from Pexels

In Stephen Covey’s masterful 7 Habits of Highly Effective People, the seventh habit is “sharpen the saw.” This refers to enhancing our assets to seek continuous improvement in our work. As Abe Lincoln said,

Give me eight hours to chop down a tree, and I will spend the first six sharpening the saw.

Better tools to structure, simplify, and broaden our Data Science work will make us more effective thinkers, decision makers, and practitioners.


A step-by-step walkthrough for a simple portfolio project using sklearn’s clustering algorithm to create an interactive dashboard for your city.

Through unsupervised learning, a data scientist can explore an unlabeled dataset to produce categories or clusters. You can use this technique to create a neighborhood explorer tool to help residents and visitors develop an understanding of points of interest near where you live.

via GitHub

You can use a publicly available points of interest dataset like I did, or you could scrape data from Yelp or TripAdvisor.
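A minimal sketch of the clustering step might look like the following. It assumes scikit-learn's `KMeans` (the teaser does not say which clustering algorithm the project uses), and the six coordinates are hypothetical stand-ins for a real points of interest dataset.

```python
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical (latitude, longitude) pairs standing in for points of interest.
# The first three sit close together; the last three sit further north.
points = np.array([
    [38.8895, -77.0353],
    [38.8893, -77.0502],
    [38.8913, -77.0261],
    [38.9296, -77.0498],
    [38.9310, -77.0380],
    [38.9288, -77.0550],
])

# Two clusters is an assumption for this toy data; a real project would
# choose the number of clusters by inspecting the data.
model = KMeans(n_clusters=2, n_init=10, random_state=0)
labels = model.fit_predict(points)

# Cluster centers can serve as map markers for each "neighborhood"
centers = model.cluster_centers_
```

Treating latitude and longitude as plain Euclidean coordinates is a simplification that is reasonable at city scale but not over large distances.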

I was inspired to create this project based on my interest in GIS data and my love of Washington, DC, where I went to…


Office Hours

The size of the digital universe increased 3000% in the past decade. Here’s how to manage all your organization’s data.

Photo by Dino Reichmuth on Unsplash

This article will help you understand the whys and hows of implementing better data management practices at your organization.


Statistics, SQL, Python, and machine learning are all important capabilities to master in the year ahead.

Tools for Data Science detective work. Photo by ian dooley on Unsplash

2020 has been a rough year. Amidst the pandemic, economic fallout, quarantine orders, racial reckoning, a stressful U.S. election, holidays spent separated from loved ones, important milestones passed without the recognition they deserved, and other deeply tragic circumstances, something troubling happened to me.


Post-COVID, machine learning is increasingly crucial for business success.

Photo by cottonbro from Pexels

COVID-19 accelerated the end of 20th-century trends and entrenched the dominance of 21st-century paradigms. For millions around the world, the pandemic response will forever shape the way we work, where we choose to live, and how we engage in commerce.

I’m telling everyone to update their model. You should really only use data since COVID if possible.

Dr. Carl Gold, former Wall Street quantitative analyst turned Chief Data Scientist and author of Fighting Churn with Data, urges Data Scientists to rethink their approach to modeling customer behavior due to model drift, which occurs when the training dataset no longer faithfully…


Python is the fastest-growing, most-beloved programming language. Get started with these Data Science tips.

Photo by Shelby Miller on Unsplash

With Python’s straightforward, human-readable syntax, anyone can access impressive capabilities for scientific computing. Python has become the standard language for data science and machine learning, and it was rated in the top three most loved languages in Stack Overflow’s 2020 Developer Survey.

#10 — List comprehensions

A simple, single-line syntax for working with lists, a list comprehension allows you to access and perform an action…
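For readers who have not seen the syntax, here is a short sketch of two common list comprehension patterns. The variable names are hypothetical and chosen only for this example.

```python
# Transform every element: square each number from 0 to 4
squares = [n ** 2 for n in range(5)]

# Filter while iterating: keep only the even numbers below 10
evens = [n for n in range(10) if n % 2 == 0]

# The equivalent explicit loop for the first example, for comparison
squares_loop = []
for n in range(5):
    squares_loop.append(n ** 2)
```

The comprehension and the loop build the same list; the comprehension simply states the transformation and the source sequence in one expression.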

Nicole Janeway Bills

Data Scientist at Atlas Research in Washington, DC | Certified Data Management Professional | www.facebook.com/groups/cdmpstudygroup
