Spark Pi and word count computation

Method: the Monte Carlo method, also known as random sampling or statistical simulation.

Steps

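1. Construct a square with side length 1 and, inside it, a quarter circle of radius 1 centered on one corner (the square's area, 1, is larger than the quarter circle's area, π/4).

2. Pick n points uniformly at random inside the square and compute each point's distance to the circle's center. Points whose distance is less than 1 lie inside the circle; call their number count.

3. count/n approximates the area ratio π/4, so 4*count/n approximates π. The Pi example in Spark is computed this way.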
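
The three steps above can be tried locally before bringing Spark into it. The following is a minimal sketch in plain Scala, not the Spark implementation; the object name MonteCarloPi and the seed parameter are assumptions added here so that runs are repeatable.

```scala
import scala.util.Random

object MonteCarloPi {
  // Estimate pi from n random points in the unit square [0, 1) x [0, 1).
  // The seed parameter is an assumption of this sketch, used for repeatable runs.
  def estimate(n: Int, seed: Long): Double = {
    val rng = new Random(seed)
    var count = 0
    var i = 0
    while (i < n) {
      val x = rng.nextDouble()
      val y = rng.nextDouble()
      // squared distance to the corner (0, 0), the center of the quarter circle
      if (x * x + y * y < 1) count += 1
      i += 1
    }
    // count / n approximates pi / 4
    4.0 * count / n
  }

  def main(args: Array[String]): Unit = {
    println(estimate(1000000, 42L))
  }
}
```

The estimate gets closer to π as n grows, but slowly: the error shrinks roughly in proportion to 1/sqrt(n).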

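import org.apache.spark.sql.SparkSession
import scala.util.Random

val sparkSession = SparkSession.builder().master("local").getOrCreate()
val sc = sparkSession.sparkContext
val slices = 6    // number of partitions
val n = 600000    // total number of random points
// Points are drawn from the square [-1, 1) x [-1, 1) around a full unit circle;
// the area ratio is still π/4, the same as for the quarter circle in a unit square.
val count = sc.parallelize(1 to n, slices).map { _ =>
  val x = Random.nextDouble() * 2 - 1
  val y = Random.nextDouble() * 2 - 1
  if (x * x + y * y < 1) 1 else 0    // 1 if the point falls inside the circle
}.reduce(_ + _)
val pi = 4.0 * count / n
println(pi)
sparkSession.stop()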

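import org.apache.spark.{SparkConf, SparkContext}

// Word count: count how often each word occurs in the input file
object WordCount {

  def main(args: Array[String]): Unit = {
    // no master is set here; it is supplied by spark-submit
    val conf = new SparkConf().setAppName("wordcount")
    val sc = new SparkContext(conf)
    val input = sc.textFile("/data/spark/demo/word_count")
    // split each line on spaces to get the individual words
    val words = input.flatMap(line => line.split(" "))
    // pair each word with 1, then sum the 1s per word
    val counts = words.map(word => (word, 1)).reduceByKey(_ + _)
    counts.saveAsTextFile("/data/spark/demo/word_count_result")
    sc.stop()
  }
}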
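
To see what the flatMap, map and reduceByKey chain computes without a cluster, the same pipeline can be mimicked on an in-memory collection. This is only an illustrative sketch: LocalWordCount is a hypothetical name, and groupBy on a Seq stands in for reduceByKey on an RDD.

```scala
object LocalWordCount {
  // Same logic as the Spark job, on an in-memory Seq instead of an RDD.
  def count(lines: Seq[String]): Map[String, Int] =
    lines
      .flatMap(line => line.split(" "))   // split each line into words
      .groupBy(identity)                  // collect equal words together
      .map { case (word, occurrences) => (word, occurrences.size) }

  def main(args: Array[String]): Unit = {
    val sample = Seq("a b a", "b c")
    count(sample).toSeq.sortBy(_._1).foreach(println)
  }
}
```

As in the Spark version, splitting on a single space means consecutive spaces produce empty-string "words" that get counted too.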

Package the jar in IntelliJ IDEA:
File - Project Structure - Artifacts - "+" - JAR - From modules with dependencies
Build - Build Artifacts

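Submit the job (7077 is the default port of a standalone master; adjust host and port to your cluster):
./bin/spark-submit \
  --master spark://localhost:7077 \
  --class WordCount /data/spark/demo/jar/spark-demo.jar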

References

https://www.cnblogs.com/aze-003/p/5127192.html