    • Data Input
    • Data Evaluation
    • Transform Tools (Stack/Unstack)
    • Sample Size Calculator
    • Stat Analysis
    • Inferential Statistics
    • Visualization
    • Advanced Visualization
    • Process Control
    • Pareto Analysis
    • Capability Analysis
    • Non-Normal Capability
    • Distribution Fitting
    • Data Transformation
    • Measurement System Analysis
      • Attribute Agreement (Kappa)
      • Gage R&R (Continuous)
    • ANOVA
      • One-Way ANOVA
      • Two-Way ANOVA
      • Generalized ANOVA
    • Correlation Analysis
    • Regression Analysis
    • Logistic Regression
    • DOE Analysis
    AnalyticsTool © 2025 Dr. Mahmood Al Kindi - This tool may be used freely with acknowledgment to the original developer.

    Data Input

    Data Upload

    File Settings

    Download Template
    Download a sample data template

    Data Preview

    Data Summary

    
                            

    Data Selection

    📊 Data Quality Check

    Data Structure

    Missing Values by Column

    ✏️ Data Editor

    Quick Actions

    Download CSV

    Row Operations

    Column Operations


    Click any cell to edit (like Excel)

    🔍 Data Filtering

    Filter Your Data

    Select which rows to keep based on column values:




    Transform Tools: Stack/Unstack/Subsets

    🧰 Data Transform

    • Unstack (Pivot wider)
    • Stack (Pivot longer)
    • Quick Subset

    Inputs

    Download CSV

    Preview

    Unstack: select one categorical 'factor' column (e.g., Supplier A/B/C) and one 'value' column (e.g., Output). Optionally provide an index/key column to align rows.
    
                                

    Inputs

    Download CSV

    Preview

    Stack: take multiple measurement columns and stack them into one long column, with an indicator of their origin.
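
To make the two reshaping operations concrete, here is a minimal sketch in plain Python. The function names `stack` and `unstack`, the column labels, and the sample numbers are illustrative assumptions, not the tool's actual implementation.

```python
from collections import defaultdict

def stack(rows, value_columns, label="Source", value="Value"):
    """Pivot longer: one output row per (input row, measurement column)."""
    return [
        {label: col, value: row[col]}
        for row in rows
        for col in value_columns
        if row.get(col) is not None  # skip missing measurements
    ]

def unstack(rows, factor, value):
    """Pivot wider: collect the values for each level of the factor."""
    wide = defaultdict(list)
    for row in rows:
        wide[row[factor]].append(row[value])
    return dict(wide)

# Hypothetical wide data: one column per supplier
wide_rows = [
    {"A": 10.1, "B": 9.8, "C": 10.4},
    {"A": 10.3, "B": 9.9, "C": 10.2},
]
long_rows = stack(wide_rows, ["A", "B", "C"], label="Supplier", value="Output")
by_supplier = unstack(long_rows, factor="Supplier", value="Output")
```

Stacking then unstacking returns the original columns, which is a quick way to check that no rows were lost in either direction.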

    Condition

    Download CSV

    Preview

    Build simple one-condition subsets quickly. For complex filtering, use the main Data Filtering box.

    Sample Size Calculator

    Sample Size & Power Calculator

    Plan your study with confidence - supports t-tests, ANOVA, proportions, correlation & more


    STEP 1: Analysis Type

    What are you comparing?


    STEP 2: What to Calculate?

    Choose your unknown

    Most users want to find the sample size needed for their study.

    STEP 3: Basic Parameters

    Standard settings

    Significance level (α): Type I error rate. Common: 5%, 1%, 10%
    Power (1 - β): Probability of detecting a true effect. Common: 80%, 90%

    STEP 4: Effect Size

    The difference you want to detect


    STEP 5: Get Results

    Results update automatically as you change inputs (2 second delay after last change)

    RESULTS

    Power Curve

    Effect Size Guide: What Numbers Should I Use?

    What is Effect Size?

    Effect size = How big is the difference you want to detect?

    Think of it like this: If you're looking for a needle in a haystack...

    • Large effect: Looking for a sword (easy to find, need fewer samples)
    • Medium effect: Looking for a key (moderate difficulty)
    • Small effect: Looking for a needle (hard to find, need MANY samples)
    How to Choose Your Effect Size
    1. Best approach: Use historical data or pilot study to estimate the real difference
    2. Six Sigma projects: Start with medium effect size for process improvements
    3. When unsure: Use small effect size (conservative, ensures adequate power)
    4. Breakthrough changes: You can expect large effects (e.g., new technology vs old)

    Effect Size Reference Tables
    Cohen's d — For Comparing Means (t-tests)

    What it measures: The difference between two means, expressed in standard deviation units.

    Formula: d = (Mean₁ - Mean₂) / Standard Deviation
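
As a worked illustration of the formula, the sketch below computes Cohen's d from two samples. It assumes the pooled standard deviation is used as the denominator (a common convention; the tool's exact choice is not stated here), and the sample values are made up.

```python
from math import sqrt
from statistics import mean, variance

def cohens_d(sample1, sample2):
    """Difference in means divided by the pooled standard deviation."""
    n1, n2 = len(sample1), len(sample2)
    pooled_var = (
        (n1 - 1) * variance(sample1) + (n2 - 1) * variance(sample2)
    ) / (n1 + n2 - 2)
    return (mean(sample1) - mean(sample2)) / sqrt(pooled_var)

# Hypothetical cycle times before and after a process change
before = [12.0, 14.0, 16.0, 18.0, 20.0]
after = [10.0, 12.0, 14.0, 16.0, 18.0]
d = cohens_d(before, after)  # mean difference 2, pooled SD about 3.16
```

Here d is about 0.63, which falls between the medium and large benchmarks.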

    Size     d value   Real-World Example
    Small    0.2       Height difference between 15 and 16 year old girls (~0.5 inch)
    Medium   0.5       Height difference between 14 and 18 year old girls (~1 inch)
    Large    0.8       Height difference between 13 and 18 year old girls
    Six Sigma Tip: For process improvement projects comparing before/after, medium (0.5) is typical. Use large (0.8) only if you expect dramatic improvement (e.g., automation vs manual).
    Cohen's f — For ANOVA (3+ Groups)

    What it measures: The spread of group means relative to within-group variation.

    When to use: Comparing 3 or more groups (e.g., 3 machines, 4 suppliers, 5 shift teams).

    Size     f value   Real-World Example
    Small    0.10      Subtle difference between 4 suppliers (hard to notice visually)
    Medium   0.25      Noticeable difference between machines (visible in box plots)
    Large    0.40      Obvious difference between methods (anyone can see it)
    Six Sigma Tip: When comparing machines, shifts, or operators, start with medium (0.25) . If you're looking for any difference at all, use small (0.10) to be safe.
    Cohen's w — For Chi-Square (Categorical Data)

    What it measures: How much the observed proportions differ from expected.

    When to use: Contingency tables, testing independence (e.g., defect type vs shift, pass/fail vs supplier).

    Size     w value   Real-World Example
    Small    0.10      Slight preference in customer survey (55% vs 45%)
    Medium   0.30      Clear pattern in defect distribution (65% vs 35%)
    Large    0.50      Strong relationship (e.g., 75% defects from one machine)
    Six Sigma Tip: For Pareto analysis or defect categorization, use medium (0.30) . This detects meaningful patterns without requiring huge samples.
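
For reference, Cohen's w can be computed directly from observed and expected proportions as w = sqrt(sum((observed - expected)² / expected)). The sketch below assumes a two-category split compared against an even 50/50 expectation; the function name is illustrative.

```python
from math import sqrt

def cohens_w(observed, expected):
    """Cohen's w from observed and expected proportions (each list sums to 1)."""
    return sqrt(sum((o - e) ** 2 / e for o, e in zip(observed, expected)))

even_split = [0.5, 0.5]
w_small = cohens_w([0.55, 0.45], even_split)   # about 0.10
w_medium = cohens_w([0.65, 0.35], even_split)  # about 0.30
w_large = cohens_w([0.75, 0.25], even_split)   # about 0.50
```

With more than two categories or unequal expected proportions the same formula applies, but the percentage splits that correspond to each benchmark will differ.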
    Correlation (r) — For Relationship Strength

    What it measures: How strongly two variables move together (ranges from -1 to +1).

    When to use: Testing if X and Y are related (e.g., temperature vs yield, training hours vs performance).

    Size     r value   Real-World Example
    Small    ±0.10     Weak link: coffee consumption vs productivity
    Medium   ±0.30     Moderate link: study time vs exam scores
    Large    ±0.50     Strong link: height vs weight, practice vs skill
    Six Sigma Tip: In root cause analysis, you often look for medium (0.30) correlations. Very high correlations (>0.7) may indicate obvious relationships or multicollinearity.
    Cohen's f² — For Regression (R² Significance)

    What it measures: How much variance in Y is explained by your predictors (X variables).

    Formula: f² = R² / (1 - R²)

    Size     f² value   Equivalent R²   Meaning
    Small    0.02       ~2%             Model explains little variance (but may still be useful)
    Medium   0.15       ~13%            Model explains moderate variance (typical for social sciences)
    Large    0.35       ~26%            Model explains substantial variance (strong predictive model)
    Six Sigma Tip: For DOE (Design of Experiments) transfer functions, aim for R² > 0.70 (f² > 2.3). For screening experiments, medium (0.15) is acceptable.
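
The conversion between f² and R² can be checked with two one-line functions. This is a small sketch of the formula f² = R² / (1 - R²) and its inverse; the function names are illustrative.

```python
def f2_from_r2(r2):
    """Cohen's f-squared from R-squared."""
    return r2 / (1 - r2)

def r2_from_f2(f2):
    """Inverse conversion: R-squared from Cohen's f-squared."""
    return f2 / (1 + f2)

benchmarks = {size: round(r2_from_f2(f2), 3)
              for size, f2 in [("small", 0.02), ("medium", 0.15), ("large", 0.35)]}
doe_target = f2_from_r2(0.70)  # about 2.33
```

The benchmark values come out as roughly 2%, 13%, and 26% of variance explained, and R² = 0.70 corresponds to f² of about 2.3.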

    Quick Decision Guide: Which Effect Size Should I Use?
    Don't know what to expect? → Use SMALL (conservative, won't under-power your study)
    Typical process improvement? → Use MEDIUM (most common in Six Sigma)
    Major change or new technology? → Use LARGE (breakthrough improvements)
    Have pilot data or historical data? → CALCULATE your actual expected effect size!
    Warning: Using a LARGE effect size when the true effect is small will result in an underpowered study (high risk of missing real effects)!

    How Effect Size Impacts Sample Size

    Example: Two-sample t-test at α=5%, Power=80%

    Effect Size   Cohen's d   n per group   Total N
    Small         0.2         393           786
    Medium        0.5         64            128
    Large         0.8         26            52

    Notice: Detecting small effects requires 15x more samples than detecting large effects!
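
The scaling with effect size can be reproduced approximately with the normal-approximation formula n = 2 (z_{1-α/2} + z_{power})² / d². This is a sketch under that assumption, not the calculator's own method; values based on the t distribution are typically about one unit higher per group for medium and large effects.

```python
from math import ceil
from statistics import NormalDist

def n_per_group(d, alpha=0.05, power=0.80):
    """Approximate n per group for a two-sided, two-sample comparison of means."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for two-sided alpha
    z_power = z.inv_cdf(power)          # quantile matching the target power
    return ceil(2 * (z_alpha + z_power) ** 2 / d ** 2)

approx = {d: n_per_group(d) for d in (0.2, 0.5, 0.8)}
```

Because n is proportional to 1/d², halving the effect size roughly quadruples the required sample.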

    Analysis

    Statistical Analysis

    Download HTML Report

    Visualization

    Results Summary

    
                                    

    Statistical Power Analysis

    Detailed Statistics

    Six Sigma Inferential Statistics Tool

    Statistical Analysis from Your Data
    Enter your sample data to calculate confidence intervals or test hypotheses.
    Perfect for DMAIC projects when you have collected measurements.

    Step 1: Choose Your Analysis


    Step 2: Enter Your Sample Data

    Step 3: Analysis Settings




    Results Summary

    Visual Results

    Detailed Analysis

    Six Sigma Interpretation

    Statistical Assumptions

    Download Report

    Data Visualization

    📊 Plot Mode

    Build custom visualizations with flexible variable selection
    Compare variables across different plot types side-by-side
    Overview of all variable distributions at once

    📋 Variable Selection

    X-Axis Variable(s)
    Y-Axis Variable(s) (Optional)
    🎨 Choose Plot Type
    📐 Layout Options
    ✨ Additional Mappings

    🔍 Comparison Setup

    📊 Distribution Overview

    🎨 Customize Your Plot

    💾 Export Plot

    Download

    Advanced Visualization

    📊 Advanced Plot Setup

    📋 Data Requirements
    🎯 Map Your Data Columns
    🎨 Additional Mappings

    ✏️ Customize Your Advanced Plot

    📥 Download CSV Template
    📁 Upload Your Data

    📋 Data Preview
    💾 Export Plot

    Download

    Statistical Process Control

    Control Chart Selection Guide

    Control Rules Selection

    Select which rules to use for detecting out-of-control conditions:

    Rule Explanations:
    • Rule 1: Any point beyond control limits
    • Rule 2: Process shift or bias detected
    • Rule 3: Systematic trend in process
    • Rule 4: Excessive variation or overcontrol
    • Rule 5: Points near control limits
    • Rule 6: Process moving away from center
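
As an illustration of Rule 1, the sketch below flags points beyond the control limits of an individuals chart. It assumes limits of mean ± 2.66 × average moving range (the usual constant for a moving range of size 2); the tool's own chart types and limit calculations may differ, and the data are made up.

```python
from statistics import mean

def individuals_limits(values):
    """Return (LCL, center line, UCL) for an individuals chart."""
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    center = mean(values)
    half_width = 2.66 * mean(moving_ranges)  # 3 / d2, with d2 = 1.128
    return center - half_width, center, center + half_width

def rule_1_violations(values):
    """Indices of points that fall outside the control limits."""
    lcl, _, ucl = individuals_limits(values)
    return [i for i, v in enumerate(values) if v < lcl or v > ucl]

data = [10.0, 10.2, 9.9, 10.1, 10.0, 9.8, 10.1, 13.0, 10.0, 9.9]
signals = rule_1_violations(data)  # the 13.0 reading (index 7) is flagged
```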

    Download Data Template
    Download a template CSV file for your control chart data
    Download Chart

    Control Charts

    Process Statistics

    
                          

    Out of Control Signals

    Pareto Analysis

    Pareto Analysis Settings

    Variable containing problem categories/defect types
    Variable containing the count for each category. If not selected, each row is counted as one occurrence of its category.
    Limit chart to the top N most frequent categories
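
The three settings above map onto a short calculation: tally the categories (or sum the supplied counts), rank them, and accumulate percentages. The sketch below is one way to do that; the function name and defect data are hypothetical, and cumulative percentages are taken relative to the grand total even when only the top N rows are shown.

```python
from collections import Counter

def pareto_table(categories, counts=None, top_n=None):
    """Return (category, count, cumulative %) rows, most frequent first."""
    if counts is None:
        totals = Counter(categories)           # each row counts once
    else:
        totals = Counter()
        for category, count in zip(categories, counts):
            totals[category] += count          # sum supplied counts
    grand_total = sum(totals.values())
    rows, cumulative = [], 0
    for category, count in totals.most_common(top_n):
        cumulative += count
        rows.append((category, count, round(100 * cumulative / grand_total, 1)))
    return rows

defects = ["Scratch"] * 50 + ["Dent"] * 30 + ["Crack"] * 15 + ["Stain"] * 5
table = pareto_table(defects)
vital_few = [cat for cat, _, cum_pct in table if cum_pct <= 80]
```

In this example two of the four defect types account for 80% of occurrences.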
    Download Chart

    Pareto Chart

    Analysis Results

    Pareto Summary

    
                            

    80/20 Analysis

    Process Capability Analysis

    Process Capability Analysis Settings

    Download Results
    Download HTML Report

    Process Capability Chart

    Capability Metrics

    Overall Capability

    Potential (Within) Capability

    Performance

    Z Benchmark

    Normal Probability Plot

    Process Performance Metrics

    Detailed Capability Analysis Results

    Non-Normal Capability Analysis

    Non-Normal Process Capability Analysis Settings

    Download Results
    Download HTML Report

    Non-Normal Process Capability Chart

    Non-Normal Capability Analysis Results

    Distribution Fitting Details:

    
                            

    About Non-Normal Capability Analysis

    Non-normal capability analysis uses fitted distributions to properly calculate capability indices when data doesn't follow a normal distribution. Standard Cp and Cpk indices can lead to incorrect conclusions with non-normal data.

    Metrics Provided:

    • Z-bench: Calculates process capability from percentiles of the fitted distribution
    • Pp(percentile): Process performance index based on percentiles
    • Ppk(percentile): Process performance index taking into account process centering
    • PPM (Parts Per Million): Expected defect rates based on the fitted distribution
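
A common percentile-based definition replaces the 6-sigma spread with the distance between the 0.135th and 99.865th percentiles of the fitted distribution, and the mean with the median. The sketch below applies that definition to a lognormal fit; both the formula choice and the simple log-scale fit are assumptions for illustration, not necessarily what this tool computes.

```python
from math import exp, log
from statistics import NormalDist, mean, stdev

def lognormal_quantile(mu, sigma, p):
    """Quantile of a lognormal with log-scale mean mu and SD sigma."""
    return exp(mu + sigma * NormalDist().inv_cdf(p))

def percentile_capability(data, lsl, usl):
    """Return (Pp, Ppk) from percentiles of a lognormal fitted to the data."""
    logs = [log(x) for x in data]
    mu, sigma = mean(logs), stdev(logs)
    low = lognormal_quantile(mu, sigma, 0.00135)
    median = lognormal_quantile(mu, sigma, 0.5)
    high = lognormal_quantile(mu, sigma, 0.99865)
    pp = (usl - lsl) / (high - low)
    ppu = (usl - median) / (high - median)  # upper-side performance
    ppl = (median - lsl) / (median - low)   # lower-side performance
    return pp, min(ppu, ppl)

# Hypothetical right-skewed measurements
data = [exp(v) for v in (-0.2, -0.1, 0.0, 0.1, 0.2)]
pp, ppk = percentile_capability(data, lsl=0.5, usl=2.0)
```

For skewed data the upper and lower sides have different spreads, which is why Ppk here is driven by the lower side even though the median is closer to the lower limit in absolute terms.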

    Distribution Selection:

    • Auto (Best Fit): Automatically selects the best-fitting distribution using Anderson-Darling statistic
    • Manual Selection: Choose a specific distribution that might be appropriate for your process

    Non-normal capability analysis is particularly important for processes with natural skewness, such as those with physical boundaries at zero (e.g., diameter, surface roughness).

    Distribution Fitting

    Distribution Fitting & Identification

    Identify the best-fitting distribution for your data - Minitab-style analysis


    STEP 1: Select Data

    Choose numeric variable to analyze

    Data must be numeric with at least 8 observations

    STEP 2: Select Distributions

    Choose distributions to fit (Minitab-style)

    • Normal Family: Symmetric Distributions
    • Weibull Family: Reliability & Life Data
    • Gamma Family: Right-Skewed Distributions
    • Extreme Value: For Min/Max Data
    • Heavy Tailed: For Data with Outliers
    • Bounded: For Bounded Data
    • Transforms: Data Transformations
    Transformations convert non-normal data to normal

    Which to choose?

    STEP 3: Options (Optional)

    Specification limits & settings

    Specification Limits (for Capability)
    Enter spec limits to calculate Ppk/Cpk

    Advanced Settings

    STEP 4: Fit & Analyze

    Run distribution fitting



    Download Report

    DISTRIBUTION FITTING RESULTS


    Distribution Rankings

    Lower Anderson-Darling (AD) = Better fit. P-value > 0.05 = Cannot reject fit.

    Legend:
    ✅ Good Fit (p > 0.10)
    ⚠️ Acceptable (0.05 ≤ p ≤ 0.10)
    ❌ Poor Fit (p < 0.05)

    Interpretation Guide

    Histogram with Fitted Distributions

    Best fit shown as solid line. Others are dashed.

    Probability Plots

    Points should follow the diagonal line if distribution fits well.
    Q-Q Plot (Quantile-Quantile)
    P-P Plot (Probability-Probability)

    All Distributions - Q-Q Grid

    Parameter Estimates

    
                          
    All Parameters Summary:

    Percentile Estimates

    Percentiles are estimated from the best-fitting distribution.

    Inverse Lookup: Find Percentile for a Value
    
                        

    Process Capability (Non-Normal)

    Random Data Generation

    Generate random samples from the fitted distribution for simulation.

    Generated Data Summary:
    
                              
                                
                                Download Data
                              
                            
    Histogram of Generated Data:

    Distribution Selection Guide

    Continuous Data
    • Normal: Symmetric, bell-shaped
    • Lognormal: Right-skewed, positive values
    • Gamma: Right-skewed, waiting times
    Reliability/Lifetime
    • Weibull: Failure times; decreasing, constant, or increasing failure rate
    • Exponential: Constant failure rate
    • Loglogistic: Accelerated life testing
    Extreme Values
    • Gumbel: Maximum values
    • SEV: Minimum values
    • Pareto: Heavy-tailed phenomena
    Special Cases
    • Beta: Bounded [0,1] proportions
    • Uniform: Equal probability
    • Cauchy: Heavy tails, no mean

    Data Transformation

    Data Transformation Tools

    Specification Limits (Optional):
    Download Transformed Data

    Before and After Transformation

    Transformation Results

    
                            

    Normality Test Results:

    Transformed Specification Limits:

    Use these values for normal capability calculations:

    One-Way ANOVA Settings

    Note:
    • Select a numeric response variable (continuous outcome)
    • Select a categorical factor variable (groups to compare)
    • Numeric variables with ≤10 unique values are included as potential factors
    • For continuous predictors with >10 values, use regression analysis instead

    • Summary
    • Means Plot
    • Effect Sizes Plot
    • Diagnostics
    • Results

    📄 Download HTML Report

    This tab shows the ANOVA table, effect sizes, and statistical tests in formatted tables.
    Shows group means with confidence intervals, individual data points, and effect size information.
    Displays the percentage of variance explained by the factor vs. within-group variance based on eta-squared.
    Diagnostic plots to check ANOVA assumptions: normality, equal variances, etc.
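
The quantities behind the Summary and Effect Sizes tabs can be sketched from the sums of squares: eta-squared is the between-group sum of squares divided by the total. The function name and the three-machine data below are illustrative only.

```python
from statistics import mean

def one_way_anova(groups):
    """Return (F statistic, eta-squared) for a list of groups of values."""
    all_values = [v for group in groups for v in group]
    grand_mean = mean(all_values)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    eta_squared = ss_between / (ss_between + ss_within)
    return f_stat, eta_squared

# Hypothetical measurements from three machines
machines = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [4.0, 5.0, 6.0]]
f_stat, eta_squared = one_way_anova(machines)
```

Here 70% of the total variance is attributed to the factor and 30% to within-group variation.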

    Two-Way ANOVA Settings

    Example Data
    Note:
    • Select a numeric response variable (continuous outcome)
    • Select two different categorical factor variables
    • Numeric variables with ≤10 unique values are included as potential factors
    • For continuous predictors with >10 values, use regression analysis instead
    • Interaction term tests if the effect of one factor depends on the other

    • Summary
    • Interaction Plot
    • Main Effects
    • Variance Plot
    • Diagnostics
    • Results

    📄 Download HTML Report

    This tab shows the Two-Way ANOVA table, variance components, and statistical tests in formatted tables.
    Shows the interaction between factors. Parallel lines indicate no interaction.
    Shows the main effect of each factor separately.
    Displays the percentage of variance explained by each source of variation.
    Diagnostic plots to check ANOVA assumptions: normality, equal variances, etc.

    📊 Generalized ANOVA Settings

    Variable Selection

    Model Options


    📄 Download Report

    📈 Analysis Results

    • 📊 Summary
    • 📈 Effects Plot
    • 📊 Main Effects
    • 🔍 Diagnostics
    • 📋 Model Details





    
                              

    ℹ️ Generalized ANOVA Information

    About Generalized ANOVA

    Generalized ANOVA allows you to analyze the relationship between one continuous response variable and multiple factors and/or covariates.

    • Factors: Categorical variables (groups)
    • Covariates: Continuous variables used as controls
    • Interactions: Test whether the effect of one variable depends on another
    • Model Types: Choose between ANOVA, Linear Model, or Mixed Effects approaches
    Model Interpretation
    • Main Effects: Individual contribution of each factor/covariate
    • Interaction Effects: Combined effects between variables
    • F-statistic: Test of significance for each effect
    • p-value < 0.05: Statistically significant effect

    Gage R&R (Continuous)

    Gage R&R Analysis


    Data Input

    New to Gage R&R?

    Download a template file to see the required data format:

    Download Template CSV

    The template includes:

    • Part column: Unique part identifiers
    • Operator column: Operator names/IDs
    • Measurement column: Numeric measurements
    • Multiple measurements per part-operator combination
    File should contain columns for Part, Operator, and Measurement
    Enter tolerance to calculate %Tolerance
    Crossed: Each operator measures the same parts (most common)
    Nested: Each operator measures different/unique parts

    Gage Evaluation


    ANOVA Results







    QCC Range Chart


    QCC Xbar Chart




    Generate Management Report

    Create a comprehensive HTML report for management presentation.


    Generate & Download Report

    Report Preview

    Click 'Generate & Download Report' to create and download a comprehensive HTML report with all analysis results and charts.

    Attribute Agreement Analysis (Gage R&R for Attributes)

    Data Setup


    Study Configuration


    Column Selection


    Analysis Options


    • Data Preview
    • Summary Report
    • Detailed Statistics
    • Disagreement Analysis
    • Visualizations
    • Report

    Attribute Agreement Analysis Report


    
                                          

    Within Appraiser Agreement

    Appraiser vs Standard Agreement

    Between Appraisers Agreement

    All Appraisers vs Standard

    Fleiss' Kappa Statistics

    Cohen's Kappa (Pairwise)

    Assessment Effectiveness

    Disagreement Summary by Part

    Disagreement Pattern Analysis

    Appraiser Bias Analysis

    Assessment Agreement Plot (Minitab Style)

    Agreement Chart

    Kappa Confidence Intervals

    Assessment Agreement Heatmap

    Professional Report



    Download Full Report (PDF)
    Download Results (Excel)

    Before Regression Analysis

    Check your data quality and assumptions before running regression

    Regression Diagnostics Tool Educational Edition

    This tool helps you understand and improve your linear regression model. Hover over icons for explanations.

    📊 Analysis Workflow

    Step 1: Upload Your Data
    Step 2: Select Variables
    Step 3: Run Analysis
    Step 4: Generate Report
    📄 Download HTML Report
    • 📊 Model Summary
    • 🔧 Technical Details
    • 📈 Correlation Analysis
    • 📚 Help & Learn

    🔍 Diagnostic Tests

    ⚠️ Influential Observations

    These rows have high influence on your model:


    📥 Download Influential Rows

    📊 Full Statistical Output

    Complete output:

    
                              

    🔗 Correlation Analysis

    Understanding relationships between your variables helps identify multicollinearity and redundant predictors.

    📊 Correlation Matrix

    Values closer to ±1 indicate stronger linear relationships.

    🎯 Interpretation Guide
    • 0.0 to ±0.3: Weak correlation
    • ±0.3 to ±0.7: Moderate correlation
    • ±0.7 to ±1.0: Strong correlation
    • |r| > 0.8 between predictors suggests multicollinearity
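
To connect the guide to numbers, the sketch below computes a Pearson correlation by hand and labels it with the same cut-offs. The helper names and data are illustrative.

```python
from math import sqrt
from statistics import mean

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    sxx = sum((a - x_bar) ** 2 for a in x)
    syy = sum((b - y_bar) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def strength(r):
    """Label a correlation using the 0.3 and 0.7 cut-offs."""
    size = abs(r)
    if size < 0.3:
        return "weak"
    if size < 0.7:
        return "moderate"
    return "strong"

x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
r = pearson_r(x, y)
label = strength(r)
```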
    🌡️ Correlation Heatmap

    Color intensity shows correlation strength.

    ⚠️ High Correlations Alert
    📋 Detailed Correlation Matrix
    📥 Download Correlation Matrix

    🎓 Understanding Regression Diagnostics

    🎯 What is Regression Analysis?

    Regression analysis helps you understand relationships between variables and make predictions. It answers questions like "How does X affect Y?" and "What will Y be when X changes?"

    📊 Key Assumptions to Check

    • Linearity: The relationship is roughly straight-line
    • Independence: Observations are unrelated to each other
    • Homoscedasticity: Variance stays constant across predictions
    • Normality: Residuals follow a bell curve
    • No Multicollinearity: Predictors are not too highly correlated

    🚨 Red Flags to Watch For

    • R-squared below 0.3 (weak model fit)
    • VIF values above 10 (severe multicollinearity)
    • Significant p-values in diagnostic tests (< 0.05)
    • Cook's Distance above 1 (very influential points)

    💡 Tips for Better Models

    • Start simple - use fewer predictors initially
    • Check data quality - look for outliers and errors
    • Consider transformations if assumptions are violated
    • Always validate on new data when possible

    💡 Example Interpretations

    Regression and Correlation Analysis



    Correlation Highlighting

    Note: Non-numeric variables are automatically detected as categorical.

    Categorical Variables
    Choose which independent variables should be treated as categorical. Non-numeric variables are automatically detected and converted to dummy/indicator variables.
    Each X variable will be transformed to polynomial terms of the specified degree.

    Ridge Regression Parameters
    Lambda controls the amount of shrinkage. Higher values = more shrinkage.
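
The shrinkage effect is easiest to see with a single centered predictor, where the ridge slope has the closed form Sxy / (Sxx + lambda). The sketch below uses that simplified case; how the tool scales lambda or standardizes predictors is not known here, so the numbers are illustrative only.

```python
from statistics import mean

def ridge_slope(x, y, lam):
    """Ridge slope for one predictor, with x and y centered (intercept unpenalized)."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    sxx = sum((a - x_bar) ** 2 for a in x)
    return sxy / (sxx + lam)

x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 6.0, 8.0, 10.0]
slopes = {lam: ridge_slope(x, y, lam) for lam in (0, 10, 100)}
```

With lambda = 0 the ordinary least squares slope of 2.0 is recovered; lambda = 10 halves it, and lambda = 100 shrinks it to about 0.18.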

    Advanced Analysis
    Download HTML Report
    
                                

    Correlation Matrix

    High Correlation Warnings

    Correlation Visualization

    Regression Plot

    Regression Summary

    
                              

    Variance Inflation Factor (VIF) - Multicollinearity Analysis

    VIF values indicate the degree of multicollinearity. Generally:

    • VIF = 1: No correlation
    • 1 < VIF < 5: Moderate correlation (acceptable)
    • 5 ≤ VIF ≤ 10: High correlation (concerning)
    • VIF > 10: Severe multicollinearity (problematic)
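
VIF for predictor j is 1 / (1 - R²_j), where R²_j comes from regressing that predictor on the others. With exactly two predictors R²_j is simply their squared correlation, which gives the short sketch below; the function names are illustrative, and the labels follow the thresholds listed above.

```python
def vif_two_predictors(r):
    """VIF when a model has two predictors with correlation r."""
    return 1 / (1 - r ** 2)

def vif_label(vif):
    """Classify a VIF value using the 5 and 10 thresholds."""
    if vif <= 1:
        return "none"
    if vif < 5:
        return "moderate"
    if vif <= 10:
        return "high"
    return "severe"

examples = {r: vif_label(vif_two_predictors(r)) for r in (0.0, 0.5, 0.9, 0.95)}
```

A correlation of 0.9 between two predictors already gives a VIF above 5, and 0.95 pushes it past 10.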

    Multiple Regression Summary

    
                              


    Pareto Chart of Standardized Effects

    Bars extending beyond the reference line indicate statistically significant predictors at α = 0.05

    Ridge Regression Summary

    
                              

    Diagnostic Plots

    Regression Equation and Model Details


    Model Performance

    ANOVA Table

    Prediction Tool

    Enter Values for Prediction

    Prediction Results

    
                                  

    Logistic Regression Analysis

    Binary Logistic Regression Analysis

    Binary outcome variable must have exactly 2 unique values (e.g., 0/1, Yes/No, Success/Failure)
    Download HTML Report
    
                              

    📐 Model Equation

    🎯 Business Insights & Strategic Recommendations

    Model Summary

    
                            

    Model Performance Metrics

    📊 Model Coefficients & Business Impact Analysis


    🎯 Odds Ratios & Strategic Impact

    Confusion Matrix

    Classification Metrics

    Diagnostic Plots

    ROC Curve Analysis

    ROC Statistics

    
                                  

    🔮 Strategic Scenario Planning Tool

    Enter Values for Prediction

    Prediction Results

    
                                

    Design of Experiments (DOE) Analysis

    • Design Generator
    • Data Import
    • Model Setup
    • ANOVA & Effects
    • Diagnostics
    • Optimization
    • Prediction & ROI
    • Help & Guide

    Design Configuration



    Design Options

    Technical replicates: Each experimental run will be repeated this many times
    Specify how many different responses you will measure (e.g., Yield, Purity, Cost)

    Design Information

    Generated Design

    Download as CSV
    Download as Excel

    Design Properties

    Design Matrix Visualization

    Run Distribution


    Correlation Structure


    Design Efficiency Metrics

    
                                      

    Upload Data

    Choose CSV or XLSX File



    Debug Info:
    
                                        

    Data Quality

    Data Preview

    • Table View
    • Summary
    • Structure


    
                                            

    
                                            

    Variable Selection


    Select your response variable replicates (e.g., Response1, Response2, Response3) and experimental factors.

    Factor Type Configuration


    Mark categorical variables (e.g., Catalyst A/B) as factors. Continuous numeric variables are fitted as continuous terms.

    Model Options


    Model Summary


    • Fitted Equation
    • R Formula
    • Coefficients
    • Model Fit
    • Data Used

    Fitted Model Equation:



    
                                            



    
                                            

    Type II ANOVA Table


    Download ANOVA

    Effect Sizes (Partial η²)


    Partial η² = SS(term) / (SS(term) + SS(error)): the share of variance a term explains after the other terms are accounted for.

    Main Effects Plot


    Shows the average response at each level of each factor.

    Two-Way Interaction Plot


    Non-parallel lines indicate interactions between factors.

    Diagnostic Interpretation


    Quick Reference
    • Residuals vs Fitted: Random scatter = good
    • Q-Q Plot: Points on line = normal
    • Scale-Location: Flat line = constant variance
    • Cook's Distance: Low values = no influential points

    All Diagnostic Plots

    Optimization Goal


    Optimal Solution


    
                                        
    
                                        
    Note: If your design is saturated, some coefficients may be aliased and cannot be estimated separately. Predictions at the tested factor settings are still valid.

    Response Surface Visualization




    Download Plot

    Predicted Response Grid


    Download Grid

    Single Point Prediction


    Prediction Result

    Six Sigma Quality Analysis

    Compare Two Settings

    Compare current process settings with proposed optimal settings to evaluate quality improvement.


    Current/Baseline Settings

    Proposed/Candidate Settings

    Specification Limits & Quality Parameters

    Define your specification limits to calculate process capability and defect rates.

    Cost Parameters (Optional)

    Six Sigma Quality Comparison Results


    • Quality Metrics
    • Cost Impact
    • Recommendation
    • Process Capability Chart





    DOE Analysis Platform - User Guide

    Quick Start

    1. Generate Design: Create your experimental design using the Design Generator
    2. Download Template: Export the design template to conduct experiments
    3. Run Experiments: Execute experiments and record responses
    4. Upload Data: Import your completed DOE dataset (CSV or XLSX format)
    5. Configure Model: Select response variable, factors, and optional blocking
    6. Fit Model: Click 'Fit Model' to run the analysis
    7. Review Results: Examine ANOVA, effects, and diagnostics
    8. Optimize: Find optimal factor settings
    9. Predict & ROI: Calculate predictions and return on investment

    Design Generation

    The Design Generator allows you to create various experimental designs including:

    • Factorial Designs: Full and fractional factorial designs for screening and characterization
    • Response Surface Designs: Central Composite (CCD) and Box-Behnken for optimization
    • Screening Designs: Plackett-Burman for many factors with few runs
    • D-Optimal Designs: Computer-generated designs for irregular regions or constraints

    After generating a design, download it as a template, run your experiments following the run order, and record your response values. Then upload the completed data for analysis.
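
As a rough picture of what a generated design template contains, the sketch below builds a full factorial design with replicates and a randomized run order. It is an illustrative stand-in, not the Design Generator's own algorithm; the function name, factor names, and levels are assumptions.

```python
import random
from itertools import product

def full_factorial(factors, replicates=1, seed=None):
    """Return runs for every factor-level combination, in random run order."""
    names = list(factors)
    combos = list(product(*(factors[name] for name in names)))
    runs = [dict(zip(names, combo)) for _ in range(replicates) for combo in combos]
    random.Random(seed).shuffle(runs)  # randomize to avoid systematic bias
    return [{"RunOrder": i, **run} for i, run in enumerate(runs, start=1)]

factors = {
    "Temperature": [180, 200],
    "Pressure": [50, 70],
    "Catalyst": ["A", "B"],
}
design = full_factorial(factors, replicates=2, seed=1)
```

Three two-level factors with two replicates give 16 runs; fixing the seed makes the run order reproducible.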


    Data Format Requirements

    • Structure: Each row = one experimental run
    • Columns: Include factors (X variables), response (Y variable), and optionally a blocking variable
    • Format: CSV or Excel (.xlsx) files supported
    • Headers: First row should contain column names
    • Missing Values: Rows with missing values will be automatically removed
    Example Data Structure:
    Temperature, Pressure, Catalyst, Block, Yield
    180,       50,       A,        1,     85.2
    200,       50,       A,        1,     89.1
    180,       70,       B,        2,     87.5
    ...
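
For checking a file against these requirements before upload, the sketch below parses the example structure with Python's `csv` module and drops rows that contain missing values. The function name is hypothetical, and it assumes no row has more fields than the header.

```python
import csv
import io

def load_doe_rows(text):
    """Parse CSV text with a header row; keep only rows with no missing values."""
    reader = csv.DictReader(io.StringIO(text.strip()), skipinitialspace=True)
    rows = []
    for row in reader:
        # Trim padding; short rows yield None, treated here as missing
        cleaned = {key.strip(): (value or "").strip() for key, value in row.items()}
        if all(cleaned.values()):
            rows.append(cleaned)
    return rows

sample = """
Temperature, Pressure, Catalyst, Block, Yield
180,       50,       A,        1,     85.2
200,       50,       A,        1,     89.1
180,       70,       B,        2,
"""
rows = load_doe_rows(sample)  # the last row has no Yield and is dropped
```

All values are returned as strings; convert numeric columns with `float` before analysis.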

    Feature Descriptions

    Design Generator

    • Design Types: Multiple design families for different objectives
    • Factor Specification: Define factor names and levels
    • Design Options: Add center points, replicates, and blocking
    • Efficiency Metrics: D-, A-, and G-efficiency calculations
    • Export Templates: Download as CSV or Excel with instructions

    ANOVA & Effects

    • Type II ANOVA: Tests significance of each factor/interaction
    • Effect Sizes: Partial η² shows variance explained by each term
    • Main Effects Plot: Shows average response at each factor level
    • Interaction Plot: Non-parallel lines indicate interactions

    Diagnostics

    • Residuals vs Fitted: Should show random scatter (no patterns)
    • Q-Q Plot: Points should follow diagonal line (normality check)
    • Scale-Location: Checks for constant variance (homoscedasticity)
    • Cook's Distance: Identifies influential observations

    Optimization

    • Objective: Choose to maximize, minimize, or hit a target
    • Grid Search: Evaluates all combinations of observed factor levels
    • Contour Plot: Visualizes response surface for two factors
    • Optimal Settings: Recommended factor levels to achieve goal
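
The grid search described above can be sketched as an exhaustive evaluation of a fitted model over all combinations of factor levels. The `predicted_yield` function is a made-up stand-in for a fitted model, and `grid_search` is an illustrative name rather than the tool's code.

```python
from itertools import product

def grid_search(predict, levels, goal="maximize", target=None):
    """Return (best settings, predicted response) over all level combinations."""
    names = list(levels)
    candidates = [dict(zip(names, combo))
                  for combo in product(*(levels[name] for name in names))]
    if goal == "maximize":
        score = lambda s: -predict(**s)
    elif goal == "minimize":
        score = lambda s: predict(**s)
    else:  # hit a target value
        score = lambda s: abs(predict(**s) - target)
    best = min(candidates, key=score)
    return best, predict(**best)

def predicted_yield(Temperature, Pressure):
    """Hypothetical fitted model, for illustration only."""
    return 40 + 0.2 * Temperature + 0.1 * Pressure

levels = {"Temperature": [180, 200], "Pressure": [50, 70]}
best_max, yield_max = grid_search(predicted_yield, levels, goal="maximize")
best_target, yield_target = grid_search(predicted_yield, levels, goal="target", target=85)
```

Because only observed levels are evaluated, the result is the best tested combination, not necessarily the optimum of the underlying response surface.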

    Prediction & ROI

    • Point Prediction: Predict response at specific factor settings
    • Cost-Benefit: Compare baseline vs candidate settings
    • Economic Analysis: Calculate net benefit and ROI
    • Payback Period: Units needed to recover changeover costs

    Choosing the Right Design

    Design Selection Guide:
    • Full Factorial: Use when you have 2-4 factors and want to study all interactions
    • Fractional Factorial: Use for 5+ factors when main effects and 2-way interactions are of interest
    • Plackett-Burman: Use for screening 7+ factors to identify the vital few
    • Central Composite (CCD): Use for optimization after identifying important factors
    • Box-Behnken: Use for optimization when you want to avoid extreme corner points
    • D-Optimal: Use when you have constraints or an irregular experimental region

    About Blocking

    Blocking removes systematic variation from nuisance factors (e.g., Day, Operator, Batch). The block variable is included additively in the model (no interactions with factors). This improves precision by accounting for block-to-block variation.

    • Use blocking when experiments are conducted in groups that may differ
    • Examples: different days, batches, machines, or operators
    • Block effects are not typically of interest but improve analysis quality

    Tips & Best Practices

    • Design First: Always create your experimental design before collecting data to ensure efficiency
    • Randomization: Run experiments in randomized order to avoid systematic bias
    • Factor Types: Categorical factors (e.g., Catalyst A/B) should be marked as factors. Continuous variables (e.g., Temperature) can remain numeric.
    • Interactions: Start with 2-way interactions. Add higher-order only if justified by subject matter knowledge.
    • Sample Size: Ensure adequate replication. Rule of thumb: ≥3 replicates per factor combination.
    • Diagnostics: Always check diagnostic plots. Patterns indicate model problems or need for transformation.
    • Outliers: Investigate influential points (high Cook's distance) - they may be errors or important findings.
    • Transformations: If residuals show non-normality or heteroscedasticity, consider log or Box-Cox transformation of response.

    Statistical Concepts

    P-value Interpretation:
    • p < 0.001: *** (highly significant)
    • p < 0.01: ** (very significant)
    • p < 0.05: * (significant)
    • p < 0.10: . (marginally significant)
    • p ≥ 0.10: not significant
    R² and Adjusted R²:
    • R²: Proportion of variance explained by model
    • Adjusted R²: Accounts for number of predictors (preferred for model comparison)
    • Values closer to 1.0 indicate better fit
    RMSE (Root Mean Square Error):
    • Typical size of the prediction error, in the same units as the response
    • Lower values indicate better fit
    • Compare to response standard deviation to assess model quality
    Design Efficiency:
    • D-efficiency: Based on the determinant of the information matrix (overall precision)
    • A-efficiency: Based on the average variance of the parameter estimates
    • G-efficiency: Based on the maximum prediction variance
    • Values closer to 1.0 indicate more efficient designs
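
The fit statistics above follow directly from the residuals. The sketch below assumes RMSE is computed as the square root of the mean squared residual; some software divides by the residual degrees of freedom instead, so small differences from the tool's output are possible. The data are made up.

```python
from math import sqrt
from statistics import mean

def fit_metrics(actual, predicted, n_predictors):
    """Return (R-squared, adjusted R-squared, RMSE) for a fitted model."""
    n = len(actual)
    y_bar = mean(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - y_bar) ** 2 for a in actual)
    r2 = 1 - ss_res / ss_tot
    adj_r2 = 1 - (1 - r2) * (n - 1) / (n - n_predictors - 1)
    rmse = sqrt(ss_res / n)
    return r2, adj_r2, rmse

actual = [1.0, 2.0, 3.0, 4.0, 5.0]
predicted = [1.1, 1.9, 3.2, 3.8, 5.0]
r2, adj_r2, rmse = fit_metrics(actual, predicted, n_predictors=1)
```

Adjusted R² is always at most R² and drops when a predictor adds little, which is why it is preferred for comparing models of different size.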

    Need Help?

    This application is designed for Design of Experiments (DOE) analysis with support for design generation, factorial designs, blocking, and economic optimization. For technical questions about DOE methodology, consult standard references such as Montgomery's 'Design and Analysis of Experiments' or Box, Hunter & Hunter's 'Statistics for Experimenters'.
