Linear Regression analyzes the influence of independent variables on a dependent variable (y=f(x)+e).
There are a number of features that help you to easily find the optimal model.
B-Box™ recommends the optimal model via the “all possible models” method.
You can easily explore alternative optimal models by transforming the x and y variables.
The outlier auto deletion feature identifies outliers using the criteria used in B-Box™, and generates regression models based on the data after the outliers have been removed.
You can easily review the basic assumptions of linear regression (normality, multicollinearity, linearity, homoscedasticity and autocorrelation) via relevant hypothesis tests.
You can easily check the predicting power of the model using test data, and review a number of models at the same time using a Z variable.
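The “all possible models” idea can be illustrated with a minimal sketch. It assumes ordinary least squares with an intercept and a Gaussian AIC computed up to an additive constant; the function names `ols_aic` and `all_possible_models` and the synthetic data are hypothetical, NumPy is assumed to be available, and this is not B-Box™'s actual implementation.

```python
import itertools
import math

import numpy as np


def ols_aic(X, y):
    """Fit OLS with an intercept and return (coefficients, AIC)."""
    n = len(y)
    design = np.column_stack([np.ones(n), X])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    rss = float(np.sum((y - design @ beta) ** 2))
    k = design.shape[1]  # number of estimated coefficients
    # Gaussian AIC up to an additive constant: n * ln(RSS / n) + 2k
    aic = n * math.log(rss / n) + 2 * k
    return beta, aic


def all_possible_models(X, y, names):
    """Fit every non-empty subset of columns and rank the subsets by AIC."""
    results = []
    for size in range(1, len(names) + 1):
        for cols in itertools.combinations(range(len(names)), size):
            _, aic = ols_aic(X[:, list(cols)], y)
            results.append((aic, [names[c] for c in cols]))
    return sorted(results)


# Synthetic data (an assumption for illustration): y depends on x1 and x2 only.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = 2.0 * X[:, 0] - 3.0 * X[:, 1] + rng.normal(scale=0.5, size=200)

ranked = all_possible_models(X, y, ["x1", "x2", "x3"])
best_aic, best_vars = ranked[0]
print(best_vars, round(best_aic, 2))
```

With p candidate variables there are 2^p - 1 subsets, so exhaustive search like this is only practical for a modest number of variables. The sketch ranks by AIC alone and does not check coefficient significance.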
Spline regression analysis splits the independent variable into regions, and generates a separate regression model for each region.
You can generate regression models by manually assigning the splitting points (knot points) on the graph.
The automatic splitting feature generates regression models by automatically calculating the optimal knot points.
In addition, you can estimate the dependent variable for a new value of the independent variable from the model.
Lastly, you can review a number of models at the same time using a Z variable.
Logistic Regression is a statistical analysis used when the response is categorical. It allows users to classify each observation and predict the probability of each response. It is similar to linear regression, the major difference being that the dependent variable can be nominal.
If users have a number of independent variables and wish to find the best combination among them, B-Box™ recommends the optimal model via the “All Possible Models” method. It considers every possible combination of the independent variables and returns the model with the minimum AIC among those whose regression coefficients are all valid.
B-Box™ also calculates residuals and detects outliers, which can help users improve model accuracy.
Time Series Analysis is used to identify characteristics and predict future values from a set of data observed with the passing of time.
B-Box™ contains the moving average, exponential smoothing, decomposition and Winters models for time series analysis. Using the auto execution feature, you can run all the models together and view the results. From the “All Possible” results tab, you can then run any one of the models and view more detailed results.
For exponential smoothing and Winters models, the smoothing factors can be automatically calculated for your convenience.
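As a rough illustration of automatic smoothing-factor selection, the sketch below applies simple exponential smoothing and picks the factor by grid search on the one-step-ahead squared error. The grid search, the function names `one_step_sse` and `best_alpha`, and the sample series are assumptions made for this example; the source does not state how B-Box™ calculates the factors.

```python
def one_step_sse(series, alpha):
    """Sum of squared one-step-ahead errors for simple exponential smoothing."""
    level = series[0]  # initialise the level with the first observation
    sse = 0.0
    for value in series[1:]:
        sse += (value - level) ** 2  # the forecast for this step is the previous level
        level = alpha * value + (1 - alpha) * level
    return sse


def best_alpha(series, grid_size=99):
    """Return the alpha on an evenly spaced grid in (0, 1) with the smallest SSE."""
    candidates = [(i + 1) / (grid_size + 1) for i in range(grid_size)]
    return min(candidates, key=lambda a: one_step_sse(series, a))


# Hypothetical sample data.
demand = [12, 15, 14, 16, 19, 18, 21, 20]
alpha = best_alpha(demand)
print(alpha, one_step_sse(demand, alpha))
```

The Winters model adds trend and seasonal smoothing factors, so the same search would run over three parameters rather than one.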
In addition, you can easily check the predicting power of the model using test data, and review a number of models at the same time using a Z variable.
Cluster Analysis attempts to group together data points which share similar characteristics.
B-Box™ contains the PAM technique, a non-hierarchical method that is robust in dealing with outliers. For your convenience, the “All possible” feature allows you to enter a range of numbers of clusters rather than specifying a single number.
Moreover, silhouette scores are produced for each case, enabling you to see the changes in the silhouette scores when the clusters change, making it very convenient for you to run simulations to find suitable clusters.
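To make the per-case silhouette score concrete, here is a minimal sketch assuming Euclidean distance and a score of 0 for cases that are alone in their cluster. The function name `silhouette_scores` and the sample points are hypothetical, and the distance measure B-Box™ uses is not stated in the source.

```python
import math
from statistics import fmean


def silhouette_scores(points, labels):
    """Return one silhouette score per case, given its cluster label."""
    scores = []
    for i, (p, own_label) in enumerate(zip(points, labels)):
        # Collect distances from case i to every other case, grouped by cluster.
        by_cluster = {}
        for j, (q, label) in enumerate(zip(points, labels)):
            if i != j:
                by_cluster.setdefault(label, []).append(math.dist(p, q))
        own = by_cluster.pop(own_label, [])
        if not own or not by_cluster:
            scores.append(0.0)  # singleton cluster or only one cluster
            continue
        a = fmean(own)  # mean distance within the case's own cluster
        b = min(fmean(d) for d in by_cluster.values())  # nearest other cluster
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return scores


# Hypothetical sample data: two well-separated pairs of points.
points = [(0, 0), (0, 1), (10, 0), (10, 1)]
good = silhouette_scores(points, [0, 0, 1, 1])
bad = silhouette_scores(points, [0, 1, 0, 1])
print(good, bad)
```

Scores near 1 indicate a case that sits well inside its cluster, while negative scores suggest it may belong to a neighbouring cluster. Comparing the scores across different numbers of clusters is one way to choose a suitable clustering.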
Discriminant Analysis is used to determine to which category a data point belongs.
Discriminant analysis can be performed on every possible combination of variables (all possible models). You can see the apparent error rate of every possible model and run any one of the models to view more detailed results.
A screen is provided where you can categorize new data points, and the current categorization is visualized as a picture.
Factor Analysis attempts to identify hidden relationships between the variables and group them together.
You can obtain prompt results after selecting your variables and defining the number of factors.
If you choose the number of factors to be 2, the result of rotating the factors is visualized on a graph.
Principal Component Analysis reduces the number of variables, allowing you to explain your data in a simpler manner.
You can obtain prompt results after selecting your variables.
Scatter plots are provided for relationships between principal components, and between a principal component and an observable variable. Furthermore, to help you determine the number of principal components to consider, a scree plot is provided.
A graph, called a biplot, is provided to visualize the relationships between your data and multidimensional variables, obtained from principal component analysis.
Structural Equation Modeling identifies various causal relationships between the variables through a single model.
For your convenience in defining a model, B-Box™ contains the variable generation feature and the basic model generation feature that allow you to define variables easily. Moreover, B-Box™ also provides a process of validating the model before the calculations begin, to check whether the model as defined is mathematically solvable.
You can see your results at a glance on a path diagram, including the results of significance tests for the estimated coefficients (insignificant paths are highlighted as dotted red lines).
For each variable, possible outliers are detected by checking whether each data point falls within a specified standard deviation range from the mean. Basic statistics for the variable before and after the removal of outliers are provided.
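The standard-deviation rule described above can be sketched as follows. The function name `split_outliers`, the default multiplier `k`, and the sample values are assumptions for illustration; the range actually used is whatever you specify.

```python
import statistics


def split_outliers(values, k=3.0):
    """Split values into (kept, removed) using mean +/- k sample standard deviations."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    kept = [v for v in values if abs(v - mean) <= k * sd]
    removed = [v for v in values if abs(v - mean) > k * sd]
    return kept, removed


# Hypothetical sample data with one suspicious value.
values = [10, 11, 9, 10, 12, 10, 11, 9, 10, 50]
kept, removed = split_outliers(values, k=2.0)

# Basic statistics before and after removal.
print(statistics.fmean(values), statistics.stdev(values))
print(statistics.fmean(kept), statistics.stdev(kept))
print(removed)
```

Because an extreme value inflates the standard deviation itself, a wide range can fail to flag it. In this sample the value 50 is flagged with k=2 but not with k=3.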
You can review a number of datasets at the same time using a Z variable.
B-Box™ provides hypothesis tests for the following:
Mean of a single variable
Variance of a single variable
Difference of the means of two variables
Ratio of the variances of two variables
Paired difference of two variables
Means of 3 or more variables
You can test a number of datasets at the same time using a Z variable.
Data Envelopment Analysis analyzes the efficiency of each data point (called Decision Making Units, or DMUs, in this context) by considering input and output variables.
For each DMU, not only is an efficiency score given, but you can also see how much the input variables can be reduced and how much the output variables can be raised. Negative values can be considered.
Introducing the concept of super efficiency, B-Box™ enables absolute evaluation of the efficiency of DMUs, as well as relative comparisons.
You can evaluate the efficiencies of a number of sets of DMUs at the same time using a Z variable.
You can find the optimal solution of an objective function under linear constraints, using solution algorithms for linear programs.
Analytic Hierarchy Process (AHP) determines an order of importance between factors considered in a decision making process, and evaluates the alternatives for each factor.
Following Professor T. L. Saaty’s AHP methodology, weights are placed on each factor in a subjective decision making process.
Subjective responses undergo a consistency test, in order to ensure reliability of the results.
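The weighting and consistency steps can be sketched as below. The sketch assumes a reciprocal pairwise comparison matrix, weights taken from the principal eigenvector (approximated here by power iteration), and Saaty's commonly cited random index values. The function name `ahp_weights` and the sample matrix are hypothetical, and B-Box™'s exact procedure is not described in the source.

```python
# Saaty's random consistency index for matrices of size 3 to 9.
RANDOM_INDEX = {3: 0.58, 4: 0.90, 5: 1.12, 6: 1.24, 7: 1.32, 8: 1.41, 9: 1.45}


def ahp_weights(matrix, iterations=100):
    """Return (weights, consistency_ratio) for a pairwise comparison matrix."""
    n = len(matrix)
    w = [1.0 / n] * n
    for _ in range(iterations):  # power iteration towards the principal eigenvector
        prod = [sum(matrix[i][j] * w[j] for j in range(n)) for i in range(n)]
        total = sum(prod)
        w = [p / total for p in prod]
    prod = [sum(matrix[i][j] * w[j] for j in range(n)) for i in range(n)]
    lambda_max = sum(prod[i] / w[i] for i in range(n)) / n
    ci = (lambda_max - n) / (n - 1)  # consistency index
    cr = ci / RANDOM_INDEX[n]        # consistency ratio
    return w, cr


# Hypothetical judgements: factor A is 3x as important as B and 5x as important as C.
comparisons = [
    [1.0, 3.0, 5.0],
    [1 / 3, 1.0, 2.0],
    [1 / 5, 1 / 2, 1.0],
]
weights, cr = ahp_weights(comparisons)
print([round(x, 3) for x in weights], round(cr, 3))
```

A consistency ratio below about 0.1 is the conventional threshold for accepting a set of judgements. Responses above it are usually revisited.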
Present value is obtained by discounting future cash flows to take into account the time value of money, and investment decisions are assisted based on the net present value thus obtained.
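The discounting calculation is short enough to show directly. The sketch assumes one cash flow per period, with the first entry occurring at time 0 (typically the initial outlay) and a constant discount rate; the function name `net_present_value` and the figures are hypothetical.

```python
def net_present_value(rate, cash_flows):
    """Discount each cash flow back to time 0 and sum the results."""
    return sum(cf / (1 + rate) ** t for t, cf in enumerate(cash_flows))


# Hypothetical project: invest 1000 now, receive 400 per year for three years.
flows = [-1000, 400, 400, 400]
npv = net_present_value(0.08, flows)
print(round(npv, 2))
```

A positive net present value suggests the investment returns more than the discount rate, and a negative value suggests it returns less.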
Survey analysis tends to be repetitive and time-consuming. B-Box™ handles this by providing various statistics, plots, reliability tests, and hypothesis tests for equal means and variances.
A decision tree is a classification and prediction method that uses a tree-like graph or model. B-Box™ adopts the CART algorithm, which uses binary splitting of the tree nodes.
Users can follow the pruning process on the tree plot. Using a Z variable, a number of models are created at once, so that users can easily find the optimal tree model.
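Binary splitting can be illustrated with a minimal sketch that searches one numeric variable for the threshold giving the lowest weighted Gini impurity. Gini impurity is a common CART criterion for classification, but the source does not say which criterion B-Box™ uses. The function names `gini` and `best_binary_split` and the sample data are hypothetical.

```python
def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))


def best_binary_split(xs, ys):
    """Return (threshold, weighted_gini) for the best split x <= threshold."""
    pairs = list(zip(xs, ys))
    values = sorted(set(xs))
    best = None
    # Candidate thresholds are midpoints between consecutive distinct values.
    for lo, hi in zip(values, values[1:]):
        threshold = (lo + hi) / 2
        left = [y for x, y in pairs if x <= threshold]
        right = [y for x, y in pairs if x > threshold]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(pairs)
        if best is None or score < best[1]:
            best = (threshold, score)
    return best


# Hypothetical sample data: class "a" for small x, class "b" for large x.
xs = [1, 2, 3, 10, 11, 12]
ys = ["a", "a", "a", "b", "b", "b"]
print(best_binary_split(xs, ys))
```

A full tree repeats this search over every variable at each node, and pruning then removes splits that do not improve the tree enough to justify the added complexity.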
It converts content from MS Word, MS PowerPoint, MS Excel, PDF, and websites into a text file. The Text Mining technique displays keywords of interest to the user as a word cloud.
Users can add or delete words, and change the cloud shape and the text font according to conditions they define.