Code Simplifier

Probability Calculations and Visualization of Bookstore Sales

This document explains probability calculations for companies based on sector and size, and includes a Matplotlib code snippet to create a stacked bar graph representing bookstore sales over five years by genre.


Prompt

One hundred and seventy (170) companies from the JSE were randomly selected and classified 
by sector and size. The table below shows the frequencies for the two categorical random 
variables, ‘sector’ and ‘size’. 

Sector         Small   Medium   Large   Row Total
Mining             3        8      30          41
Financial          9       21      42          72
Service           10        6       8          24
Retail            14       13       6          33
Column Total      36       48      86         170

a. What is the probability that a randomly selected JSE company will be small and operate in the 
service sector?                                                                           
(3 marks) 
b. What is the probability that a randomly selected JSE company will be both small and medium
sized?                   
(3 marks) 
c. What is the probability that a randomly selected JSE company will be either a small company, 
a service sector company, or both?               
(3 marks) 
d. What is the probability that a randomly selected company is a retail company, given that it is 
known (in advance) to be a medium-sized company?             
(3 marks) 
e. What is the probability of selecting a small retail company from the JSE-listed sample 
companies? 
A bookstore categorises its sales into four main genres: Fiction, Non-Fiction, Children's Books, and 
Textbooks. The table shows data for a bookstore over the past five years.   

Year   Fiction   Non-Fiction   Children’s Books   Textbooks
2019       120            95                 55          80
2020       110           100                 65          85
2021       130           110                 70          75
2022       125           115                 75          90
2023       140           120                 80          95

Your task is to:
1. Create a stacked bar graph using matplotlib, where each bar represents a year, and the 
segments of each bar represent the sales for each book genre. 
2. Use a different colour for each genre and include a legend. 
3. Add appropriate labels for the x-axis (years) and y-axis (sales in thousands of dollars). 
4. Include a title for the graph. 
5. Add text labels on each stacked bar segment to show the exact sales figure for that genre and year

Answer

Simplified Code and Calculations

This document provides simplified calculations and a visualization task related to the data provided. The first section covers the computations for the probabilities associated with companies classified by their sector and size, and the second section details the creation of a stacked bar graph using Matplotlib.

Section 1: Probability Calculations

The frequency table from the prompt, with counts of companies by sector and size:

Sector/Size    Small   Medium   Large   Row Total
Mining             3        8      30          41
Financial          9       21      42          72
Service           10        6       8          24
Retail            14       13       6          33
Column Total      36       48      86         170

The total number of companies = 170.

Probability Calculations

  1. Probability of small and service sector (P(Small ∩ Service))
    P(Small and Service) = Number of small service companies / Total companies
    P(Small and Service) = 10 / 170 = 0.059

  2. Probability of both small and medium sized (P(Small ∩ Medium))
    Each company belongs to exactly one size class, so 'small' and 'medium' are mutually exclusive events.
    P(Small and Medium) = 0 / 170 = 0

  3. Probability of small or service (P(Small ∪ Service))
    Using the addition rule:
    P(Small ∪ Service) = P(Small) + P(Service) - P(Small and Service)
    P(Small) = 36/170, P(Service) = 24/170, P(Small and Service) = 10/170
    P(Small or Service) = (36/170) + (24/170) - (10/170) = 50/170 = 0.294

  4. Probability of retail given medium (P(Retail | Medium))
    P(Retail | Medium) = P(Medium ∩ Retail) / P(Medium)
    = (13/170) / (48/170) = 13 / 48 = 0.271

  5. Probability of selecting a small retail company (P(Small and Retail))
    P(Small and Retail) = Number of small retail companies / Total companies
    Small retail companies = 14
    P(Small and Retail) = 14 / 170 = 0.082
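
As a cross-check, the five answers can be recomputed directly from the cell counts in the prompt's frequency table. The sketch below is one way to do that in plain Python. The variable names (counts, row_total, col_total, p_a to p_e) are illustrative, and the cell layout is an assumption: it is the reading of the table in which every row and column sums to its stated total.

```python
# Cell counts by sector and size, read from the prompt's table (assumed layout:
# the one in which rows and columns sum to the stated totals).
counts = {
    "Mining":    {"Small": 3,  "Medium": 8,  "Large": 30},
    "Financial": {"Small": 9,  "Medium": 21, "Large": 42},
    "Service":   {"Small": 10, "Medium": 6,  "Large": 8},
    "Retail":    {"Small": 14, "Medium": 13, "Large": 6},
}
sizes = ["Small", "Medium", "Large"]

# Margins computed from the cells rather than typed in by hand.
row_total = {sector: sum(row.values()) for sector, row in counts.items()}
col_total = {size: sum(row[size] for row in counts.values()) for size in sizes}
total = sum(row_total.values())

# Sanity check against the margins stated in the prompt.
assert row_total == {"Mining": 41, "Financial": 72, "Service": 24, "Retail": 33}
assert col_total == {"Small": 36, "Medium": 48, "Large": 86}
assert total == 170

# (a) joint probability: small AND service
p_a = counts["Service"]["Small"] / total
# (b) small AND medium: size classes are mutually exclusive
p_b = 0.0
# (c) addition rule: P(A or B) = P(A) + P(B) - P(A and B)
p_c = (col_total["Small"] + row_total["Service"] - counts["Service"]["Small"]) / total
# (d) conditional probability: restrict the sample to medium companies
p_d = counts["Retail"]["Medium"] / col_total["Medium"]
# (e) joint probability: small AND retail
p_e = counts["Retail"]["Small"] / total

for name, p in [("a", p_a), ("b", p_b), ("c", p_c), ("d", p_d), ("e", p_e)]:
    print(f"({name}) {p:.3f}")
# prints (a) 0.059, (b) 0.000, (c) 0.294, (d) 0.271, (e) 0.082
```

Deriving the margins from the cells, instead of copying them, means a mistyped cell fails the assertions before any probability is reported.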

Section 2: Stacked Bar Graph with Matplotlib

The following code provides a visual representation of bookstore sales over five years.

import matplotlib.pyplot as plt
import numpy as np

# Data: sales in thousands of dollars, one value per year
years = ['2019', '2020', '2021', '2022', '2023']
sales = {
    'Fiction': [120, 110, 130, 125, 140],
    'Non-Fiction': [95, 100, 110, 115, 120],
    "Children's Books": [55, 65, 70, 75, 80],
    'Textbooks': [80, 85, 75, 90, 95],
}
# Dark colours so the white segment labels stay readable
colors = ['tab:blue', 'tab:green', 'tab:red', 'tab:purple']

# Stacked bar graph
bar_width = 0.5
indices = np.arange(len(years))
bottom = np.zeros(len(years))  # running height of each stack

# Draw one layer per genre and label each segment at its midpoint
for (genre, values), color in zip(sales.items(), colors):
    values = np.array(values)
    plt.bar(indices, values, bar_width, bottom=bottom, label=genre, color=color)
    for x, base, value in zip(indices, bottom, values):
        plt.text(x, base + value / 2, str(value), ha='center', va='center', color='white')
    bottom += values  # the next genre is stacked on top of this one

# Adding labels and title
plt.xlabel('Year')
plt.ylabel('Sales (thousands of dollars)')
plt.title('Bookstore Sales by Genre, 2019-2023')
plt.xticks(indices, years)
plt.legend()

plt.show()
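
The segment labels can also be produced without computing midpoints by hand. The sketch below is an alternative, not the approach used above: it relies on Axes.bar_label, which is available in Matplotlib 3.4 and later. The helper name plot_stacked_sales is hypothetical, and the sales figures are read from the prompt's table, which is an assumption wherever the extracted layout of that table is ambiguous.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line to open a window with plt.show()
import matplotlib.pyplot as plt


def plot_stacked_sales(years, sales):
    """Draw one stacked bar per year and label every segment with its value.

    Hypothetical helper: `sales` maps a genre name to one value per year.
    """
    fig, ax = plt.subplots()
    bottom = [0] * len(years)  # running height of each stack
    for genre, values in sales.items():
        bars = ax.bar(years, values, width=0.5, bottom=bottom, label=genre)
        # label_type="center" writes each segment's own height at its midpoint
        ax.bar_label(bars, label_type="center", color="white")
        bottom = [b + v for b, v in zip(bottom, values)]
    ax.set_xlabel("Year")
    ax.set_ylabel("Sales (thousands of dollars)")
    ax.set_title("Bookstore Sales by Genre")
    ax.legend()
    return fig, ax


# Sales in thousands of dollars (assumed reading of the prompt's table)
years = ["2019", "2020", "2021", "2022", "2023"]
sales = {
    "Fiction": [120, 110, 130, 125, 140],
    "Non-Fiction": [95, 100, 110, 115, 120],
    "Children's Books": [55, 65, 70, 75, 80],
    "Textbooks": [80, 85, 75, 90, 95],
}
fig, ax = plot_stacked_sales(years, sales)
```

To view the result, call fig.savefig with a file name of your choice, or remove the backend line and call plt.show(). Returning the figure and axes keeps the helper reusable for other tables with the same shape.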

Summary of Tasks

  • Section 1 calculates the requested probabilities from the company sector and size frequency table.
  • Section 2 creates a stacked bar graph of bookstore sales per genre over five years using Matplotlib, with each segment labelled with its sales figure.

This structured approach helps in understanding both statistical calculations and coding a visual representation effectively while delivering clear insights.
