BYOMSPM

Build-Your-Own Master’s Degree in Product Management

Find here my thoughts on a collection of podcasts, articles, and videos related to product management, organized like a semester of a Master’s degree.

Module 3 / Tech / Statistical Relationships



This post is a refresher on some common statistical relationships. Definitions and explanations have been purposefully simplified (see the sources at the bottom for deeper explanations).

Grade I gave myself for this assignment: 90/100 (short but dense)

First, the basics:

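Correlation – a statistical relationship between two events or variables: they tend to change together, but neither one necessarily causes the other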

Causation – a relationship between two events in which one event causes the other

How to prove causation – the best way I found is to run a randomized controlled experiment: randomly split the subjects into two groups and treat both groups the same except for one variable (the one you’re trying to prove causes something). If the outcomes then differ between the groups, that one variable is the most plausible explanation.
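To make the random split concrete, here is a minimal Python sketch. The function names (`random_split`, `difference_in_means`) and the user IDs are made up for illustration, and a real experiment would also need a large enough sample and a significance test before drawing any conclusion.

```python
import random
import statistics

def random_split(subjects, seed=0):
    """Shuffle the subjects and cut the list in half: control and treatment."""
    rng = random.Random(seed)      # fixed seed only so the example is repeatable
    shuffled = list(subjects)      # copy, so the caller's list is left untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

def difference_in_means(control_outcomes, treatment_outcomes):
    """Average outcome in the treatment group minus the control group."""
    return statistics.mean(treatment_outcomes) - statistics.mean(control_outcomes)

# Hypothetical subjects: ten user IDs.
users = [f"user_{i}" for i in range(10)]
control, treatment = random_split(users)
print(len(control), len(treatment))
```

Because chance alone decides who lands in which group, other differences between subjects should roughly balance out, which is what lets the one changed variable take the credit (or blame) for a difference in outcomes.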

3 criteria for causation, according to John Stuart Mill (Duckworth):

  1. Temporal Precedence – “cause” event happens before “effect” event
  2. Covariance – the two events are shown to be related (they vary together)
  3. Disqualification of Alternative Explanations – other plausible causes have been ruled out

Different ways to measure the relationship between two variables (in purposefully simple terms):

  • Covariance is a “quantitative measure of the degree to which the deviation of one variable (X) from its mean is related to the deviation of another variable (Y) from its mean” (Ejaz, 2023). Covariance values can range from negative infinity to infinity, where positive values indicate that two variables change in the same direction and negative values indicate that they change in opposite directions (Ejaz, 2023). A value of zero indicates no linear relationship, though the variables can still be related in a non-linear way. Because the value depends on the units of the two variables, its size alone does not tell you how strong the relationship is.
  • Correlation measures the direction of the relationship, as covariance does, but it also measures the strength of the relationship. The correlation coefficient has a range of [-1, 1], where values of -1 or 1 indicate “perfect” correlation, values near 0 indicate little or no linear relationship, positive values indicate that an increase in one variable relates to an increase in the other, and negative values indicate that an increase in one variable relates to a decrease in the other. “In simple terms, correlation refers to the scaled version of covariance” (Ejaz, 2023).
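A short Python sketch can show how the two measures relate. It assumes the sample versions of the formulas (dividing by n - 1) and the Pearson correlation coefficient, and the data (`hours_in_app`, `tasks_done`) is invented for illustration.

```python
import math

def covariance(xs, ys):
    """Sample covariance: average product of deviations from each mean."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)

def correlation(xs, ys):
    """Pearson correlation: covariance scaled by both standard deviations.

    Assumes neither variable is constant (otherwise this divides by zero).
    """
    return covariance(xs, ys) / math.sqrt(covariance(xs, xs) * covariance(ys, ys))

# Invented data: time spent in an app and tasks completed.
hours_in_app = [1, 2, 3, 4, 5]
tasks_done = [2, 4, 6, 8, 10]
minutes_in_app = [h * 60 for h in hours_in_app]   # same data, different unit

# Covariance changes with the unit of measurement...
print(covariance(hours_in_app, tasks_done))       # about 5
print(covariance(minutes_in_app, tasks_done))     # about 300
# ...while correlation does not.
print(correlation(hours_in_app, tasks_done))      # about 1
print(correlation(minutes_in_app, tasks_done))    # about 1

# Zero covariance only rules out a linear relationship: here y = x squared.
xs = [-2, -1, 0, 1, 2]
ys = [x * x for x in xs]
print(covariance(xs, ys))                         # about 0
```

Switching from hours to minutes multiplies the covariance by 60 but leaves the correlation unchanged, which is the sense in which correlation is the "scaled version" of covariance.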

Lastly, I looked at 4 types of data science (from Paleri, 2023):

  • Descriptive: uses analysis of data to understand and explain what happened in the past
  • Inferential: uses findings from a small sample to generalize to a larger population
  • Predictive: “use of data to predict future events” (Paleri, 2023) (lots of AI/ML, including recommender systems, would be here)
  • Prescriptive: analysis of data in order to make decisions or prescribe actions
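As a rough illustration of how the four types differ, the Python sketch below runs each one on the same tiny dataset. Everything in it is an assumption made for the example: the weekly signup numbers are invented, the interval is a crude "mean plus or minus two standard errors" rather than a proper confidence interval, the forecast is a plain least-squares line, and the capacity threshold is hypothetical.

```python
import statistics

# Invented data: signups in each of six consecutive weeks.
weeks = [1, 2, 3, 4, 5, 6]
weekly_signups = [120, 135, 150, 160, 178, 190]

# Descriptive: summarize what already happened.
average = statistics.mean(weekly_signups)

# Inferential: treat these weeks as a sample and estimate a plausible
# range for the long-run average (a crude interval, for illustration only).
std_error = statistics.stdev(weekly_signups) / len(weekly_signups) ** 0.5
plausible_range = (average - 2 * std_error, average + 2 * std_error)

# Predictive: fit a least-squares line and extrapolate to week 7.
mean_week = statistics.mean(weeks)
slope = (
    sum((w - mean_week) * (s - average) for w, s in zip(weeks, weekly_signups))
    / sum((w - mean_week) ** 2 for w in weeks)
)
intercept = average - slope * mean_week
forecast_week_7 = intercept + slope * 7

# Prescriptive: turn the forecast into a recommended action.
onboarding_capacity = 200   # hypothetical threshold
action = "add capacity" if forecast_week_7 > onboarding_capacity else "hold steady"

print(average, plausible_range, forecast_week_7, action)
```

The same numbers feed all four steps; what changes is the question being asked, from "what happened?" to "what should we do?".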

Thanks for reading.


Works Cited

Duckworth, Angela Lee et al. “Establishing Causality Using Longitudinal Hierarchical Linear Modeling: An Illustration Predicting Achievement From Self-Control.” Social Psychological and Personality Science, vol. 1, no. 4, 2010, pp. 311-317. doi:10.1177/1948550609359707.

Ejaz, Nimra. “What’s the Difference Between Covariance and Correlation?” Career Foundry. 31 August 2023. https://careerfoundry.com/en/blog/data-analytics/covariance-vs-correlation/.

Paleri, Jayadev. “4 types of Data Science / Analysis.” LinkedIn. 21 February 2023. https://www.linkedin.com/pulse/4-types-data-science-analysis-jayadev-paleri/.

