Akka实战：开发一个多线程新闻爬虫

akka, cassandra, crawler, scala

代码：https://github.com/yangbajing/crawler-service

使用Scala开发一个多线程爬虫，利用Akka库来管理多个爬虫任务的分散和聚合操作。同时使用scheduleOnce来设置爬取任务在指定时间内完成。详细需求如下：

可同时从多个新闻源（搜索引擎）检索新闻
已爬取过的新闻存库，第二次访问时直接从库里读取
提供duration参数，调用方可设置调用超时。超时到，则Server返回已爬取新闻。且爬虫任务需继续进行，并完成存库