The manager process need not perform any computational work. Please calculate the speedup with respect to the number of worker processes only: for P=7, for example, start one manager process and 7 worker processes. Preferably use P=7,15,31,etc., so that the P+1 processes fill complete blades. Alternatively, since the manager is presumably mostly idle, you can pin the manager and the first worker to the same core and use P=8,16,32,... instead.
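As a rough illustration of the bookkeeping, the following Python sketch picks worker counts that fill whole blades and computes speedup and efficiency from the worker count alone. It rests on assumptions not stated above: the helper names are hypothetical, the blade size of 8 cores is only inferred from the suggested values P=7,15,31, and the baseline is taken to be a run with one manager and a single worker.

```python
def worker_counts(cores_per_blade, blade_counts):
    """Return worker counts P such that P workers plus one manager fill whole blades."""
    return [blades * cores_per_blade - 1 for blades in blade_counts]


def speedup(t_one_worker, t_p_workers):
    """Speedup of a P-worker run relative to the assumed one-worker baseline."""
    return t_one_worker / t_p_workers


def efficiency(t_one_worker, t_p_workers, workers):
    """Parallel efficiency: speedup divided by the worker count (manager excluded)."""
    return speedup(t_one_worker, t_p_workers) / workers


# Assumed blade size of 8 cores, using 1, 2 and 4 blades.
print(worker_counts(8, [1, 2, 4]))
```

With the assumed 8 cores per blade, `worker_counts(8, [1, 2, 4])` yields the worker counts 7, 15 and 31. The run times passed to `speedup` and `efficiency` would be your own measured wall-clock times; note that the divisor in `efficiency` is P, not P+1.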