How to do a linear regression with sklearn

We require the user to have a Python Anaconda environment already installed. Test that scikit-learn was correctly installed:

Step 2: Generate random linear data

We are going to choose fixed values of m and b for the formula y = x*m + b. Then, with a random error of 1%, we will generate the random points. Usually you will not know this information in advance; we generate this data for teaching purposes. With these points we are going to use sklearn to create a linear regression and verify how close we get to the fixed m and b values that we chose.

In our code we first define the function f, which is a linear function. In particular, this function adds +/- 1% of random error to the result; without it we would get a perfectly straight line. Then we generate 300 random points that we will use to train a model. If we plot the points we will get:

Step 3: Use scikit-learn to do a linear regression

Now we are ready to start using scikit-learn to do a linear regression. Using the list of values, we will feed the fit method of the linear regression. We also separate the data into two pieces: train and test. With the test piece, in step 5 we are going to measure the error of the trained linear model. The output should be similar to:

Step 4: Plot the result

Now we are going to plot the fitted regression:

Step 5: Measure the error

The output should be similar to:

Conclusion

Simple linear regression is a statistical method that allows us to summarize and study the relationship between two continuous (quantitative) variables. It is a good idea to start with a linear regression for learning or when you begin to analyze data, since linear models are simple to understand. If a linear model is not the way to go, you can then move to more complex models.

Linear regression tries to find a relation between variables. Scikit-learn is a Python library that is used for machine learning, data processing, cross-validation and more.
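The steps described above (generating noisy linear data, splitting it into train and test, fitting, and measuring the error) could be sketched roughly as follows. This is a minimal sketch, not the original code of this post: the values of `M_TRUE` and `B_TRUE`, the x range of 0 to 100, the 80/20 split, the random seeds, and the choice of mean squared error and R^2 as error measures are all assumptions made for illustration. Plotting is left out to keep the example short.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Hypothetical fixed slope and intercept (assumed values, not from the post).
M_TRUE = 2.0
B_TRUE = 5.0

rng = np.random.default_rng(seed=0)  # seeded so the run is reproducible


def f(x):
    """Linear function y = x*m + b with up to +/- 1% random error on the result."""
    y = x * M_TRUE + B_TRUE
    noise = rng.uniform(-0.01, 0.01, size=np.shape(x))
    return y * (1.0 + noise)


# Step 2: generate 300 random points (the 0..100 range is an arbitrary choice).
x = rng.uniform(0.0, 100.0, size=300)
y = f(x)

# Step 3: fit expects X as a 2D array, one row per sample, one column per feature.
X = x.reshape(-1, 1)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

model = LinearRegression()
model.fit(X_train, y_train)
print("fitted m:", model.coef_[0])
print("fitted b:", model.intercept_)

# Step 5: measure the error on the held-out test piece.
y_pred = model.predict(X_test)
mse = mean_squared_error(y_test, y_pred)
print("test MSE:", mse)
print("test R^2:", model.score(X_test, y_test))
```

With noise this small, the fitted slope and intercept should land close to the chosen `M_TRUE` and `B_TRUE`, which is the check the tutorial is after.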
In this tutorial we are going to do a simple linear regression using this library. In particular, we are going to play with some randomly generated data that we will use to train a model.

When to use linear regression

Use it when you think there is a linear relation between the variables to analyze (X and Y), like we see in figure 0. What we are going to fit is the slope (m) and the y-intercept (b), so we are going to get a function like: y = x*m + b. In this case the linear combination only has x since we are using 2D data, but the general linear model, where y is the predicted value, is: y = w0 + w1*x1 + ... + wp*xp

How to fix ImportError: numpy.core.multiarray failed to import

Try upgrading numpy with conda or, if you are not using Anaconda, with pip.

How to fix ValueError: Found arrays with inconsistent numbers of samples

Check that the X value passed to the fit function is a list of lists, with one inner list of feature values for each Y.
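The shape requirement behind that ValueError can be illustrated with a small example. This is only a sketch under assumed data: the four sample points are hypothetical and follow y = 2*x + 1 exactly, and the exact error text raised for badly shaped input may differ between scikit-learn versions.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical sample data that follows y = 2*x + 1 exactly.
x_flat = [0.0, 1.0, 2.0, 3.0]
y = [1.0, 3.0, 5.0, 7.0]

# Each sample must be its own list of feature values: [[0.0], [1.0], ...].
X_lists = [[value] for value in x_flat]

# Equivalent NumPy form: reshape the flat values to (n_samples, 1).
X_array = np.asarray(x_flat).reshape(-1, 1)
print(X_array.shape)  # (4, 1)

# Both forms are accepted by fit; here we use the list of lists.
model = LinearRegression().fit(X_lists, y)

# predict takes the same 2D layout, even for a single sample.
print(model.predict([[4.0]]))
```

The same layout applies when there are several features: each inner list then holds one value per feature, and its length must be the same for every sample.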