Unit 1: Introduction to Excel for Data Analysis
        
        
            1. Introduction to Microsoft Excel and its Features
            
            What is Excel?
            Microsoft Excel is a spreadsheet program used to record, organize, analyze, and visualize quantitative data. It is a fundamental tool for data analysis.
            Key Features and Components:
            
                - Workbook: The entire Excel file (e.g., analysis.xlsx).
- Worksheet (Sheet): A single "tab" within a workbook. A workbook can contain multiple worksheets.
- Cell: The intersection of a row and a column, where data is entered (e.g., A1,B2).
- Ribbon: The toolbar at the top (containing tabs like 'Home', 'Insert', 'Data', 'Formulas').
- Formula Bar: The fxbar where you can see and edit the contents or formula of the active cell.
- Formulas and Functions: Excel's core strength. Formulas (e.g., =A1+B1) and built-in functions (e.g.,=SUM(A1:A5)) perform calculations.
- Charts: Tools to visualize data (covered in Unit 2).
- Data Tools: Features for managing data like sorting, filtering, and pivot tables.
        
            2. Basic Data Entry and Editing Techniques
            
            Data Entry
            
                - Entering Data: Click a cell and type. Press Enter to move down or Tab to move right.
- Data Types: Excel recognizes text (e.g., 'Male'), numbers (e.g., 10.5), dates (e.g., '22/10/2025'), and logical values (TRUE/FALSE).
- Fill Handle: The small square at the bottom-right of a selected cell. Drag it to copy data or extend a series (e.g., 'Jan', 'Feb', 'Mar'...).
Editing Techniques
            
                - Edit a Cell: Press F2, or double-click the cell, or edit directly in the formula bar.
- Clear Contents: Select cell(s) and press the Delete key.
- Formatting: Use the Home tab to change font, color, alignment, and (most importantly) number formats (e.g., Currency, Percentage, Date).
- Find and Replace: Use Ctrl+F (Find) or Ctrl+H (Replace) to quickly find or change data.
        
            3. Importing and Exporting Data
            
            Importing Data
            Often, your data is in an external file, like a .csv (Comma Separated Values) or text file.
            
                - Go to the Data tab.
- In the Get & Transform Data group, select From Text/CSV.
- Browse for your file and click Import.
- A preview window appears. Check the Delimiter (e.g., Comma, Tab) to ensure Excel is splitting the columns correctly.
- Click Load.
Exporting Data
            You can save your Excel sheet in other formats.
            
                - Go to File > Save As.
- Choose a location.
- In the "Save as type" dropdown menu, select the format you need (e.g., CSV (Comma delimited), PDF).
- Click Save.
        
            4. Sorting and Filtering Data
            
            Sorting
            Arranging your data in a specific order (e.g., A-Z, Smallest to Largest).
            
                - Select your entire data range (including headers).
- Go to the Data tab > Sort & Filter group.
- Click the Sort button for multi-level sorting (e.g., sort by 'State', then by 'City').
- Check "My data has headers".
- Choose the column to sort by, the sort criteria (Values, Color, etc.), and the order (A-Z).
                Common Pitfall: Always select *all* your data before sorting a single column, or you will scramble your records!
            
            Filtering
            Hiding rows that do not meet your criteria, allowing you to focus on a subset of data.
            
                - Select your data range (or just the header row).
- Go to the Data tab > Sort & Filter group.
- Click the Filter button. Dropdown arrows will appear on your headers.
- Click the arrow on a header to see filter options. You can uncheck values you don't want to see, or use Number Filters (e.g., "Greater Than", "Top 10").
        
            5. Removing Duplicates
            This tool permanently deletes rows that are identical based on the columns you select.
            
                - Select the data range.
- Go to the Data tab > Data Tools group.
- Click Remove Duplicates.
- A dialog box appears. Check the columns you want Excel to use to identify a duplicate row. (e.g., if you only check 'Email', it will remove any row with the same email, even if the names are different).
- Click OK. Excel will report how many duplicates were found and removed.
        
            6. Data Validation
            A tool to restrict what can be entered into a cell, ensuring data integrity and preventing errors.
            
                - Select the cell(s) where you want to apply the rule.
- Go to the Data tab > Data Tools group.
- Click Data Validation.
- In the Settings tab:
                    
                        - Allow: Choose the data type (e.g., Whole number, List, Date).
- Data: Set the criteria (e.g., "between", "greater than").
- Example: To allow only 'Male' or 'Female' in a cell, choose Allow: List and in Source: type Male,Female.
 
- (Optional) In the Input Message tab, you can write a prompt for the user.
- (Optional) In the Error Alert tab, you can write a custom error message if invalid data is entered.
        
            7. Handling Missing Observations
            Real-world data is often incomplete ("missing"). We cannot simply run calculations on cells that are blank or contain errors (like #N/A).
            
            Methods for Handling Missing Data:
            
                - Find Blank Cells:
                    
                        - Select your data column.
- Go to Home > Find & Select > Go To Special...
- Select Blanks and click OK. All blank cells will be highlighted. You can then color them or enter a value.
 
- Deletion:
                    
                        - Listwise Deletion: Delete the entire row. This is easy but can remove a lot of good data.
 
- Imputation (Replacing the value):
                    
                        - Mean/Median Imputation: Replace the missing value with the mean or median of the entire column. This is a common statistical technique.
                            
                                - Calculate the mean of the column (e.g., =AVERAGE(A2:A100)).
- Use the "Go To Special... Blanks" method above.
- Type the mean value (e.g., 50.5) and press Ctrl+Enter to fill all highlighted blank cells at once.
 
 
- Using Functions that Ignore Blanks:
                    
                        - Functions like AVERAGE,SUM,MEDIAN,STDEV.S, etc., automatically ignore blank cells.
- If your "missing" data is text (e.g., "NA"), these functions will fail. Use AVERAGEA(which treats text as 0) or filter/replace "NA" first.