I want to make a program to recognize the digit in an image. I follow the tutorial in scikit learn .
I can train and fit the svm classifier like the following.
First, I import the libraries and dataset
from sklearn import datasets, svm, metrics digits = datasets.load_digits() n_samples = len(digits.images) data = digits.images.reshape((n_samples, -1))
Second, I create the SVM model and train it with the dataset.
classifier = svm.SVC(gamma = 0.001) classifier.fit(data[:n_samples], digits.target[:n_samples])
And then, I try to read my own image and use the function predict()
to recognize the digit.
I reshape the image into (8, 8) and then convert it to a 1D array.
img = misc.imread("w1.jpg") img = misc.imresize(img, (8, 8)) img = img[:, :, 0]
Finally, when I print out the prediction, it returns [1]
predicted = classifier.predict(img.reshape((1,img.shape[0]*img.shape[1] ))) print predicted
Whatever I user others images, it still returns [1]
When I print out the "default" dataset of number "9", it looks like:
My image number "9" :
You can see the non-zero number is quite large for my image.
I dont know why. I am looking for help to solve my problem. Thanks
6 Answers
Answers 1
My best bet would be that there is a problem with your data types and array shapes.
It looks like you are training on numpy arrays that are of the type np.float64
(or possibly np.float32
on 32 bit systems, I don't remember) and where each image has the shape (64,)
.
Meanwhile your input image for prediction, after the resizing operation in your code, is of type uint8
and shape (1, 64)
.
I would first try changing the shape of your input image since dtype conversions often just work as you would expect. So change this line:
predicted = classifier.predict(img.reshape((1,img.shape[0]*img.shape[1] )))
to this:
predicted = classifier.predict(img.reshape(img.shape[0]*img.shape[1]))
If that doesn't fix it, you can always try recasting the data type as well with
img = img.astype(digits.images.dtype)
.
I hope that helps. Debugging by proxy is a lot harder than actually sitting in front of your computer :)
Edit: According to the SciPy documentation, the training data contains integer values from 0 to 16. The values in your input image should be scaled to fit the same interval. (http://scikit-learn.org/stable/modules/generated/sklearn.datasets.load_digits.html#sklearn.datasets.load_digits)
Answers 2
1) You need to create your own training set - based on data similar to what you will be making predictions. The call to datasets.load_digits()
in scikit-learn is loading a preprocessed version of the MNIST Digits dataset, which, for all we know, could have very different images to the ones that you are trying to recognise.
2) You need to set the parameters of your classifier properly. The call to svm.SVC(gamma = 0.001)
is just choosing an arbitrary value of the gamma parameter in SVC, which may not be the best option. In addition, you are not configuring the C parameter - which is pretty important for SVMs. I'd bet that this is one of the reasons why your output is 'always 1'.
3) Whatever final settings you choose for your model, you'll need to use a cross-validation scheme to ensure that the algorithm is effectively learning
There's a lot of Machine Learning theory behind this, but, as a good start, I would really recommend to have a look at SVM - scikit-learn for a more in-depth description of how the SVC implementation in sickit-learn works, and GridSearchCV for a simple technique for parameter setting.
Answers 3
It's just a guess but... The Training set from Sk-Learn are black numbers on a white background. And you are trying to predict numbers which are white on a black background...
I think you should either train on your training set, or train on the negative version of your pictures.
I hope this help !
Answers 4
If you look at: http://scikit-learn.org/stable/modules/generated/sklearn.datasets.load_digits.html#sklearn.datasets.load_digits
you can see that each point in the matrix as a value between 0-16.
You can try to transform the values of the image to between 0-16. I did it and now the prediction works well for the digit 9 but not for 8 and 6. It doesn't give 1 any more.
from sklearn import datasets, svm, metrics import cv2 import numpy as np # Load digit database digits = datasets.load_digits() n_samples = len(digits.images) data = digits.images.reshape((n_samples, -1)) # Train SVM classifier classifier = svm.SVC(gamma = 0.001) classifier.fit(data[:n_samples], digits.target[:n_samples]) # Read image "9" img = cv2.imread("w1.jpg") img = img[:,:,0]; img = cv2.resize(img, (8, 8)) # Normalize the values in the image to 0-16 minValueInImage = np.min(img) maxValueInImage = np.max(img) normaliizeImg = np.floor(np.divide((img - minValueInImage).astype(np.float),(maxValueInImage-minValueInImage).astype(np.float))*16) # Predict predicted = classifier.predict(normaliizeImg.reshape((1,normaliizeImg.shape[0]*normaliizeImg.shape[1] ))) print predicted
Answers 5
Hi in addition to @carrdelling respond, i will add that you may use the same training set, if you normalize your images to have the same range of value. For example you could binaries your data ( 1 if > 0, 0 else ) or you could divide by the maximum intensity in your image to have an arbitrary interval [0;1].
Answers 6
You probably want to extract features relevant to to your data set from the images and train your model on them. One example I copied from here.
surf = cv2.SURF(400) kp, des = surf.detectAndCompute(img,None)
But the SURF features may not be the most useful or relevant to your dataset and training task. You should try others too like HOG or others.
Remember this more high level the features you extract the more general/error-tolerant your model will be to unseen images. However, you may be sacrificing accuracy in your known samples and test cases.
0 comments:
Post a Comment