While searching, I came across a few performance caveats about the filter and exists collection APIs.

For now, I am clipping them here.




1. filter

https://www.sumologic.com/blog-technology/3-tips-for-writing-performant-scala/


Using lazy collections must be taken with a grain of salt — while lazy collections often can improve performance, they can also make it worse.  For example:

def nonview = (1 to 5000000).map(_ % 10).filter(_ > 5).reduce(_ + _)
def view = (1 to 5000000).view.map(_ % 10).filter(_ > 5).reduce(_ + _)

For this microbenchmark, the lazy version ran 1.5x faster than the strict version.  However, for smaller values of n, the strict version will run faster. Lazy evaluation requires the creation of an additional closure. If creating the closures takes longer than creating intermediate collections, the lazy version will run slower. Profile and understand your bottlenecks before optimizing!


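Because filter builds a new collection internally, the article reports that the lazy version using view.filter ran about 1.5x faster in this microbenchmark. For small n, though, the strict version can be the faster one.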
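To try the comparison at more than one size, here is a minimal sketch. It takes n as a parameter instead of the fixed 5000000 from the article, and the names StrictVsView and timeMs are my own placeholders, not from the article. A single wall-clock measurement on the JVM is noisy (JIT warm-up, GC), so treat the printed numbers as a rough hint only.

```scala
object StrictVsView {
  // Strict pipeline: map and filter each build an intermediate collection.
  // Note: reduce throws on an empty result, so n should be at least 6.
  def nonview(n: Int): Int =
    (1 to n).map(_ % 10).filter(_ > 5).reduce(_ + _)

  // Lazy pipeline: the view defers map and filter until reduce consumes it.
  def view(n: Int): Int =
    (1 to n).view.map(_ % 10).filter(_ > 5).reduce(_ + _)

  // Hypothetical helper: evaluates body once and returns (result, elapsed ms).
  def timeMs[A](body: => A): (A, Double) = {
    val start = System.nanoTime()
    val result = body
    (result, (System.nanoTime() - start) / 1e6)
  }

  def main(args: Array[String]): Unit = {
    for (n <- Seq(100, 5000000)) {
      val (a, strictMs) = timeMs(nonview(n))
      val (b, lazyMs) = timeMs(view(n))
      println(f"n=$n%d strict=$strictMs%.2f ms lazy=$lazyMs%.2f ms same=${a == b}")
    }
  }
}
```

Both versions return the same sum; only the timings differ. For a result you can trust, a benchmarking harness such as JMH would be a better choice than this helper.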


Also, filter is a linear-time operation: it traverses the entire collection.
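The linear cost is easy to see by counting predicate calls. The sketch below is only an illustration; FilterCost, strictFilter and lazyFilter are made-up names. It shows that a strict filter calls the predicate once per element, while view.filter calls it zero times until something forces the view.

```scala
object FilterCost {
  // Strict filter: returns (number of kept elements, number of predicate calls).
  def strictFilter(n: Int): (Int, Int) = {
    var calls = 0
    val kept = (1 to n).toList.filter { x => calls += 1; x % 2 == 0 }
    (kept.size, calls)
  }

  // Lazy filter: returns (predicate calls before forcing, number of kept elements).
  def lazyFilter(n: Int): (Int, Int) = {
    var calls = 0
    val pending = (1 to n).toList.view.filter { x => calls += 1; x % 2 == 0 }
    val before = calls          // nothing has been evaluated yet
    val kept = pending.toList   // forcing the view runs the predicate
    (before, kept.size)
  }

  def main(args: Array[String]): Unit = {
    println(strictFilter(1000))
    println(lazyFilter(1000))
  }
}
```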





2. exists



http://stackoverflow.com/questions/16443177/scala-which-data-structures-are-optimal-in-which-siutations-when-using-contai


With exists, you really just care about how fast the collection is to traverse--you have to traverse everything anyway. There, List is usually the champ (unless you want to traverse an array by hand), but only Set and so on are usually particularly bad (e.g. exists on List is ~8x faster than on a Set when each have 1000 elements). The others are within about 2.5x of List(usually 1.5x, but Vector has an underlying tree structure which is not all that fast to traverse).



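exists is a linear-time operation that, in the worst case, traverses the entire collection. According to the answer, List.exists is about 8x faster than Set.exists when each holds 1000 elements. The other collections are within about 2.5x of List (usually about 1.5x).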
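A small sketch to check this locally. ExistsCost, countedExists and timeMs are names I made up for illustration. The call counter shows that exists stops at the first match and only visits every element when nothing matches. The timing loop at the end is a naive measurement, so I would not expect it to reproduce the 8x figure exactly.

```scala
object ExistsCost {
  // Runs exists and returns (result, number of predicate calls).
  def countedExists[A](xs: Iterable[A])(p: A => Boolean): (Boolean, Int) = {
    var calls = 0
    val found = xs.exists { x => calls += 1; p(x) }
    (found, calls)
  }

  // Hypothetical helper: total elapsed ms for `reps` evaluations of body.
  def timeMs(reps: Int)(body: => Boolean): Double = {
    val start = System.nanoTime()
    var i = 0
    while (i < reps) { body; i += 1 }
    (System.nanoTime() - start) / 1e6
  }

  def main(args: Array[String]): Unit = {
    val list: List[Int] = (1 to 1000).toList
    val set: Set[Int] = list.toSet

    println(countedExists(list)(_ == 10)) // stops at the first match
    println(countedExists(list)(_ == -1)) // no match: visits every element

    // Worst case (no match) on both collections, repeated to reduce noise.
    val listMs = timeMs(10000)(list.exists(_ == -1))
    val setMs = timeMs(10000)(set.exists(_ == -1))
    println(f"List.exists=$listMs%.2f ms Set.exists=$setMs%.2f ms")
  }
}
```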




Posted by '김용환'