Please complete the following steps.
Part 1: View the following videos:
● MS Video: Use Formulas and Functions
● MS Video: Create and Format Pivot Tables and Pivot Charts
Data analysis often involves pulling data together from different sources, and the data often need to be “cleaned up” before they can be analyzed effectively. For example, public companies file annual financial statements with the Securities and Exchange Commission (SEC) in 10-K filings. These are available for public download as Excel files from EDGAR, the SEC's online filing database. This allows researchers to combine data from many companies into one table for analysis. Yet companies are not consistent in how they name certain accounts or format their financial statements. Data analysts must “clean up” the data to make naming conventions and formatting consistent in order to combine and analyze the data effectively.
Part 2: Gather your files
Go to EDGAR and download the FY22 10-K Excel files for Pfizer (PFE), Merck (MRK) and AbbVie (ABBV). To do so, you will need to:
1-Access the EDGAR Company Filings page, and enter the company name or ticker (e.g. “Merck” or “MRK”) in the Company and Person Lookup search bar.
2-On the company page, you can find the 10-K report in one of two places:
● under Selected Filings > 10-K (annual reports) or
● under Latest Filings > View Filings, after entering “10-K” in the Search table filter
3-Once you’ve located the 10-K filing, click the Filing icon to the right of the document link. This will open a new page. (Do not click the document itself.)
4-Click on the Interactive Data option next to the 10-K filing information. This will open a new page.
5-Click on View Excel Document. This will initiate the download of the 10-K report data.
6-Save all three files to the same folder, renaming each one based on company ticker, file type and year (e.g., MRK_10K_FY22).
7-In order to combine data, we will use an Excel function called VLOOKUP in Part 3, below.
Part 3: Clean the data and complete the VLOOKUP task
1. Read through how the VLOOKUP function works: Microsoft Excel VLOOKUP
2. Create a new Excel file.
3. Add the following columns.
a. Company Name
b. Year
c. Sales
d. Cost of Goods Sold
e. Gross Profit
f. Net Income
4. Use VLOOKUP to populate the Company Name from your files.
a. For example, if using the naming convention above, the following would populate the Pfizer name:
=VLOOKUP("Entity Registrant Name",'[PFE_10K_FY22.xlsx]Document and Entity Information'!$A:$D, 2, FALSE)
The formula searches the first column of the array for the value “Entity Registrant Name” and returns the value from the second column of the matching row. The array is columns A through D of the tab “Document and Entity Information” in the file PFE_10K_FY22.xlsx. The final argument, FALSE, requires an exact match. In this case the value in that cell is PFIZER INC.
You can see the power of VLOOKUP to quickly pull data from different sources. One could duplicate this formula and change only the file name to pull data from many sources, assuming the files are formatted the same; that is, the same array, columns, tab names and lookup value apply.
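To see the lookup logic outside of Excel, here is a minimal Python sketch of what an exact-match VLOOKUP does. The function name `vlookup` and the two-row sample sheet are illustrative assumptions, not part of Excel or of the 10-K files; only the “Entity Registrant Name” row mirrors the Pfizer example.

```python
from typing import Any, Optional, Sequence


def vlookup(lookup_value: Any, table: Sequence[Sequence[Any]], col_index: int) -> Optional[Any]:
    """Exact-match lookup, loosely mirroring VLOOKUP(..., FALSE).

    Scans the first column of `table` from top to bottom and returns the
    value in column `col_index` (1-based, as in Excel) of the first
    matching row. Returns None where Excel would show #N/A.
    Note: Excel ignores case when comparing text; this sketch does not.
    """
    for row in table:
        if row and row[0] == lookup_value:
            return row[col_index - 1]
    return None


# Hypothetical stand-in for the "Document and Entity Information" tab.
sheet = [
    ["Document Type", "10-K"],
    ["Entity Registrant Name", "PFIZER INC"],
]

name = vlookup("Entity Registrant Name", sheet, 2)
missing = vlookup("Entry Registrant Name", sheet, 2)  # misspelled lookup value

print(name)     # the value from the second column of the matching row
print(missing)  # None, because no row matches the misspelled value
```

The misspelled lookup shows why consistent lookup values matter: an exact-match lookup finds nothing unless the label matches.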
5. Create nine rows in total in your Excel file: three for each company. Use VLOOKUP to populate the company name in each row, and then fill in the years FY22, FY21 and FY20, so that each company has one row per year. The Find and Replace feature in Excel (Ctrl+H) can help you quickly change the file name within the formula for each company.
6. We can use the same VLOOKUP function to pull data for Sales, Cost of Goods Sold and Net Income. Look through the Excel files for the three companies and identify some barriers to using VLOOKUP. For example, VLOOKUP works best when the files are formatted the same, the tab names are the same and the lookup values are the same. Is that the case with these files? Identify at least three challenges you see with using VLOOKUP.
7. This is the reality with many data sets such as 10-Ks. They are inconsistent between companies, and often inconsistent even within a company. It is often faster to keep the VLOOKUP formula consistent, and instead change the data sets to fit the VLOOKUP parameters. In this case we would have to:
a. Make the Consolidated Statements of Income tab have a consistent name
b. Make Sales, Cost of Goods Sold and Net Income consistent lookup values
c. Make sure the data we want to pull is within the same column
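As a rough illustration of the clean-up idea, the Python sketch below standardizes account labels so that one lookup value works for every company. The label variants and the numbers are made up for illustration; check the actual files to see which names each company uses.

```python
# Hypothetical label variants; the real filings may use different wording.
CANONICAL_LABELS = {
    "revenues": "Sales",
    "net revenues": "Sales",
    "sales": "Sales",
    "cost of sales": "Cost of Goods Sold",
    "cost of products sold": "Cost of Goods Sold",
    "net income": "Net Income",
    "net earnings": "Net Income",
}


def clean_label(raw: str) -> str:
    """Map a raw account label to its canonical name, if one is known."""
    key = " ".join(raw.lower().split())  # lowercase, trim, collapse spaces
    return CANONICAL_LABELS.get(key, raw)  # unknown labels pass through


def clean_rows(rows):
    """Return copies of the rows with the first-column label standardized."""
    return [[clean_label(row[0]), *row[1:]] for row in rows]


# Placeholder rows, not real financial data.
raw_rows = [
    ["  Net revenues ", 100, 90],
    ["Cost of products sold", 40, 35],
    ["Research and development", 10, 9],
]
cleaned = clean_rows(raw_rows)

for row in cleaned:
    print(row)
```

Changing the data to fit one lookup value, rather than writing a different formula per company, is the same trade-off described in step 7.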
8. Use your Excel files with “cleaned up” data to practice VLOOKUP and populate the remaining values for Sales, Cost of Goods Sold, and Net Income for each company for years FY22, FY21 and FY20. (Note that the “cleaned up” files only include the Document and Entity Information and Consolidated Statements of Income tabs, and that only those values needed for pulling data have been changed.) Calculate Gross Profit as Sales minus Cost of Goods Sold. When you are finished, you should have a data set that is 10 rows (including the header) by 6 columns.
9. Pivot tables are a useful tool for analyzing time series data and making year-over-year comparisons. Create a new sheet and use pivot tables to analyze year-over-year changes in Sales, Cost of Goods Sold, Gross Profit and Net Income.
10. In a new sheet, comment on some individual corporate and industry trends you see.
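The arithmetic behind the Gross Profit column and the year-over-year comparisons can be sketched in Python as follows. This assumes that “percentage difference” means the change relative to the earlier year, (later − earlier) / earlier × 100. The company name and figures are placeholders, not values from the 10-K files.

```python
def gross_profit(sales: float, cogs: float) -> float:
    """Gross Profit = Sales - Cost of Goods Sold."""
    return sales - cogs


def pct_change(earlier: float, later: float) -> float:
    """Year-over-year change as a percentage of the earlier year's value."""
    return (later - earlier) / earlier * 100


# Placeholder rows, not real financial data.
rows = [
    {"company": "Company A", "year": "FY21", "sales": 200.0, "cogs": 80.0},
    {"company": "Company A", "year": "FY22", "sales": 250.0, "cogs": 100.0},
]

# Index the rows by year so the two years can be compared directly.
by_year = {row["year"]: row for row in rows}

sales_growth = pct_change(by_year["FY21"]["sales"], by_year["FY22"]["sales"])

gp_fy21 = gross_profit(by_year["FY21"]["sales"], by_year["FY21"]["cogs"])
gp_fy22 = gross_profit(by_year["FY22"]["sales"], by_year["FY22"]["cogs"])
gp_growth = pct_change(gp_fy21, gp_fy22)

print(f"Sales growth FY21 to FY22: {sales_growth:.2f}%")
print(f"Gross Profit growth FY21 to FY22: {gp_growth:.2f}%")
```

A pivot table set to show values as a percentage difference from the previous year performs the same kind of comparison across all nine rows at once.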
QUESTIONS:
1-Answer the following questions based upon your completion of the exercise.
A) What is the percentage difference from FY21 to FY22 for Annual Sales for Merck?
Select answer from the options below
23.43%
17.31%
21.72%
3.30%
B) What is the percentage difference from FY20 to FY21 for Cost of Goods Sold for Pfizer?
Select answer from the options below
11.43%
263.28%
-6.98%
13.38%
C) What is the percentage difference from FY21 to FY22 for Annual Gross Profit for AbbVie?
Select answer from the options below
30.95%
27.40%
4.87%
20.28%
D) Which company had the largest percentage growth in Net Income from FY21 to FY22: Merck, Pfizer, or AbbVie?
Select answer from the options below
Pfizer
AbbVie
Merck
You cannot tell from the data provided.
2-