MAP
REDUCE
Example 1: given an asymmetric social network (a → b means a follows b), output users who (i) have at least 2M followers, (ii) follow fewer than 20 users, and (iii) are followed back by every user they follow
M1 (a, b):
# strategy - a mapper **can emit 2 KV outputs** per input as long as both follow the same format
# strategy - tag each value with a flag (FOLLOWING / FOLLOWER) so the reducer can tell them apart and filter
output (a, (FOLLOWING, b))
output (b, (FOLLOWER, a))
R1 (x, V):
collect U_FOLLOWERS = set of users u from all (FOLLOWER, u) in V
collect U_FOLLOWING = set of users u from all (FOLLOWING, u) in V
if |U_FOLLOWERS| >= 2M and |U_FOLLOWING| < 20 and every user in U_FOLLOWING is also in U_FOLLOWERS:
output (x, _)
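A minimal single-machine sketch of Example 1 in Python, meant only to check the logic of M1 and R1. The names `map_edge`, `reduce_user` and `run_job` are hypothetical, the in-memory grouping stands in for the shuffle phase, and the thresholds are parameters so the job can be tried on a tiny graph.

```python
from collections import defaultdict

FOLLOWING, FOLLOWER = "FOLLOWING", "FOLLOWER"


def map_edge(a, b):
    """M1: for the edge a -> b, emit one tagged record per endpoint."""
    yield a, (FOLLOWING, b)  # a follows b
    yield b, (FOLLOWER, a)   # b is followed by a


def reduce_user(x, values, min_followers=2_000_000, max_following=20):
    """R1: split the values by tag, then apply the three conditions."""
    followers = {u for tag, u in values if tag == FOLLOWER}
    following = {u for tag, u in values if tag == FOLLOWING}
    if (len(followers) >= min_followers
            and len(following) < max_following
            and following <= followers):  # subset test: everyone x follows also follows x
        yield x, None


def run_job(edges, **thresholds):
    """Hypothetical driver: group map output by key, then reduce each group."""
    groups = defaultdict(list)
    for a, b in edges:
        for key, value in map_edge(a, b):
            groups[key].append(value)
    out = []
    for key in sorted(groups):
        out.extend(reduce_user(key, groups[key], **thresholds))
    return out


# Toy graph with scaled-down thresholds (assumed values, for illustration only).
edges = [("a", "b"), ("b", "a"), ("c", "a"), ("c", "b"), ("b", "d")]
result = run_job(edges, min_followers=2, max_following=3)
print(result)
```

On this toy graph only `a` qualifies: `b` has enough followers but follows `d`, who does not follow back, while `c` and `d` have too few followers.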
Example 2: given D1 (people, location, start_time, end_time) and D2 (infected_people), output people who came into contact with infected people
M1 (name, loc, start, end):
output (name, ("WHERE", loc, start, end))
R1:
*identity or null*
M2 (name):
output (name, ("POSITIVE", _, _, _))
R2:
*identity or null*
# strategy - feeding the outputs of MR1 and MR2 into one job is **allowed** **because both use the same KV format**
# specify that the input to M3 is the union of tuples with value ("WHERE", *, *, *) ∪ ("POSITIVE", *, *, *)
M3 (name, (CONDITION, loc, start, end)):
process by CONDITION
...
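A small Python sketch of the tagging strategy in Example 2, assuming the identity reducers pass records through unchanged (so they are omitted here). It only shows that M1 and M2 emit values of the same 4-tuple shape, that their outputs can therefore be concatenated, and that M3 can branch on the tag. What each branch should actually do is left open in the notes, so the return values of `map_by_condition` are hypothetical placeholders, not the contact-detection logic.

```python
WHERE, POSITIVE = "WHERE", "POSITIVE"


def map_visit(name, loc, start, end):
    """M1: tag a D1 row as a location record."""
    yield name, (WHERE, loc, start, end)


def map_positive(name):
    """M2: tag a D2 row, padding to the same 4-tuple shape as M1."""
    yield name, (POSITIVE, None, None, None)


def map_by_condition(name, value):
    """M3 skeleton: dispatch on the tag. The per-branch results are
    placeholders (an assumption); the source does not specify them."""
    condition, loc, start, end = value
    if condition == WHERE:
        return ("visit", name, loc, start, end)
    if condition == POSITIVE:
        return ("infected", name)
    raise ValueError(f"unknown condition: {condition!r}")


# Toy inputs (assumed values, for illustration only).
visits = [("ann", "cafe", 9, 10), ("bob", "cafe", 9, 11)]
positives = ["bob"]

mr1_out = [kv for row in visits for kv in map_visit(*row)]
mr2_out = [kv for name in positives for kv in map_positive(name)]

# Same KV format, so the two outputs form one valid input for M3.
m3_input = mr1_out + mr2_out
routed = [map_by_condition(name, value) for name, value in m3_input]
print(routed)
```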