My question is based upon this.
- Would it be possible to get more detailed comments or an explanation of the code, starting at this line:
tf = HashingTF().transform( training_raw.map(lambda doc: doc["text"], preservesPartitioning=True))
- How could I print the confusion matrix?
- What does the error below mean, and how can I fix it? The model still gets built and I get predictions.
>>> # Train and check
... model = NaiveBayes.train(training)
[Stage 2:=============================> (2 + 2) / 4]16/04/05 18:18:28 WARN BLAS: Failed to load implementation from: com.github.fommil.netlib.NativeSystemBLAS
16/04/05 18:18:28 WARN BLAS: Failed to load implementation from: com.github.fommil.netlib.NativeRefBLAS
- How could I print the prediction for a new observation? I tried and failed:
>>> model.predict("love")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "c:\spark-1.6.1-bin-hadoop2.6\spark-1.6.1-bin-hadoop2.6\python\pyspark\mllib\classification.py", line 594, in predict
    x = _convert_to_vector(x)
  File "c:\spark-1.6.1-bin-hadoop2.6\spark-1.6.1-bin-hadoop2.6\python\pyspark\mllib\linalg\__init__.py", line 77, in _convert_to_vector
    raise TypeError("Cannot convert type %s into Vector" % type(l))
TypeError: Cannot convert type <class 'str'> into Vector
1 Answers
1. HashingTF in Spark is similar to scikit-learn's HashingVectorizer: it maps each term to a vector index by hashing and stores the term's count there. training_raw is an RDD of text. For a detailed explanation of the available vectorizers in PySpark, see Vectorizers. For a complete example, see this post.
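2. BLAS is the Basic Linear Algebra Subprograms library. The warnings mean that the native BLAS implementations could not be loaded, so a pure-JVM fallback is used instead; that is why the model still gets built. You can check out this page on GitHub for a potential solution.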
3. You are trying to call model.predict on a string ("love"). You must first convert the string to a vector. A simple example that parses a line containing a label and space-separated dense feature values, and returns a labeled dense vector, is:
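from pyspark.mllib.linalg import Vectors
from pyspark.mllib.regression import LabeledPoint

def parseLine(line):
    # Expected input format: "<label>,<f1> <f2> ... <fn>"
    parts = line.split(',')
    label = float(parts[0])
    features = Vectors.dense([float(x) for x in parts[1].split(' ')])
    return LabeledPoint(label, features)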
For hashed text features you are probably looking for a sparse vector, so try Vectors.sparse instead.
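For the model.predict("love") case specifically, a minimal sketch of the idea is to run the new text through the same kind of HashingTF step used for training and pass the resulting vector to predict. The helper name predict_text and its tokenize parameter are hypothetical, not part of Spark. The sketch assumes the HashingTF is configured as it was for training and that tokenize splits the text the same way the training documents were split.

```python
def predict_text(model, hashing_tf, text, tokenize=str.split):
    """Vectorize raw text, then ask the model for a prediction.

    model      -- any object with .predict(vector), e.g. a trained NaiveBayesModel
    hashing_tf -- any object with .transform(list_of_terms), e.g. a
                  pyspark.mllib.feature.HashingTF configured as in training
    tokenize   -- assumption: must mirror how the training text was split
    """
    terms = tokenize(text)                 # "i love spark" -> ["i", "love", "spark"]
    vector = hashing_tf.transform(terms)   # list of terms -> term-frequency vector
    return model.predict(vector)           # vector -> predicted label


# Intended use in the pyspark shell (not executed here; needs a trained model):
#   from pyspark.mllib.feature import HashingTF
#   tf = HashingTF()
#   print(predict_text(model, tf, "love"))
```

Keeping the vectorizer and model as parameters means the same helper works for any classifier that exposes predict, as long as the features are built the same way as at training time.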