Code Extender | Google BigQuery

Employee Information Analysis

Analysing employee information to calculate total number of employees, earliest employment date and employment duration for each department


Empty image or helper icon

Prompt

Extend Here's a tricky one for you, Code Extender AI tool. Generate a code snippet to perform the following complex operations using Google BigQuery SQL:

1. Create a table in Google BigQuery with the name `employee_infos`. This table should include the following fields: `employee_id` (numeric), `first_name` (string), `last_name` (string), `employment_date` (date), `department` (string).

2. Insert some mock data into the `employee_infos` table for testing.

3. Write a query to select all records from the `employee_infos` where `employment_date` is on or after '2020-01-01'.

4. Group the selected records by the `department` field, and for each department output, the total number of employees and the earliest `employment_date`.

5. Define a user-defined function within this SQL that calculates the duration of employment in years for each employee and use it within the SQL query you wrote above to also include the employment duration in the final output. 

Remember, all these should be accomplished using Google BigQuery SQL. by 

Answer

-- Creating the employee_infos table
CREATE TABLE employee_infos (
  employee_id NUMERIC,
  first_name STRING,
  last_name STRING,
  employment_date DATE,
  department STRING
);

-- Inserting mock data into the employee_infos table
INSERT INTO employee_infos (employee_id, first_name, last_name, employment_date, department)
VALUES (1, 'John', 'Doe', '2020-01-15', 'Marketing'),
       (2, 'Jane', 'Smith', '2020-02-20', 'HR'),
       (3, 'Mike', 'Johnson', '2019-12-10', 'Finance'),
       (4, 'Emily', 'Williams', '2020-05-03', 'IT');

-- Query to select all records with employment_date on or after '2020-01-01'
SELECT * 
FROM employee_infos
WHERE employment_date >= '2020-01-01';

-- Query to group records by department and calculate total number of employees and earliest employment_date
SELECT department, COUNT(*) as total_employees, MIN(employment_date) as earliest_employment_date
FROM employee_infos
WHERE employment_date >= '2020-01-01'
GROUP BY department;

-- User-defined function to calculate employment duration in years
CREATE TEMP FUNCTION calculate_duration(employment_date DATE) AS (
  DATE_DIFF(CURRENT_DATE(), employment_date, YEAR)
);

-- Final query including employment duration
SELECT department, COUNT(*) as total_employees, MIN(employment_date) as earliest_employment_date, 
       calculate_duration(employment_date) as employment_duration
FROM employee_infos
WHERE employment_date >= '2020-01-01'
GROUP BY department;

Explanation:

  • First, we create a table called employee_infos using the CREATE TABLE statement. This defines the structure of the table with the specified fields: employee_id, first_name, last_name, employment_date, and department.
  • Then, we insert mock data into the employee_infos table using the INSERT INTO statement. This adds sample records for testing purposes.
  • Next, we write a simple SELECT query to retrieve all records from the employee_infos table where the employment_date is on or after '2020-01-01'.
  • To group the selected records by the department field and calculate the total number of employees and the earliest employment_date, we use the GROUP BY clause along with the aggregate functions COUNT() and MIN().
  • After that, we define a user-defined function called calculate_duration using the CREATE FUNCTION statement. This function calculates the employment duration in years by taking the employment_date as input and using the DATE_DIFF() function with the YEAR unit.
  • Finally, we modify the previous query to include the employment duration by calling the calculate_duration() function. This adds the employment_duration column to the final output.

Note: This code is written in Google BigQuery SQL.

Create your Thread using our flexible tools, share it with friends and colleagues.

Your current query will become the main foundation for the thread, which you can expand with other tools presented on our platform. We will help you choose tools so that your thread is structured and logically built.

Description

This project involves analyzing a table called "employee_infos" which contains information about employees such as their employee ID, first name, last name, employment date, and department. The project includes creating the table, inserting mock data, and writing queries to retrieve relevant information such as the total number of employees, earliest employment date, and employment duration for each department. The queries also involve grouping the records by department and using aggregate functions to calculate the required information. Additionally, a user-defined function called "calculate_duration" is created to calculate the employment duration in years for each employee.