A VMA currently supports four memory allocation policies. The comment in the kernel source describes them as follows:

     * Support four policies per VMA and per process:
     *
     * The VMA policy has priority over the process policy for a page fault.
     *
     * interleave     Allocate memory interleaved over a set of nodes,
     *                with normal fallback if it fails.
     *                For VMA based allocations this interleaves based on the
     *                offset into the backing object or offset into the mapping
     *                for anonymous memory. For process policy an process counter
     *                is used.
     *
     * bind           Only allocate memory on a specific set of nodes,
     *                no fallback.
     *                FIXME: memory is allocated starting with the first node
     *                to the last. It would be better if bind would truly restrict
     *                the allocation to memory nodes instead
     *
     * preferred      Try a specific node first before normal fallback.
     *                As a special case NUMA_NO_NODE here means do the allocation
     *                on the local CPU. This is normally identical to default,
     *                but useful to set in a VMA when you have a non default
     *                process policy.
     *
     * default        Allocate on the local node first, or when on a VMA
     *                use the process policy. This is what Linux always did
     *                in a NUMA aware kernel and still does by, ahem, default.

numa_policy_init() sets the memory allocation policy of every NUMA node to MPOL_PREFERRED (stored in preferred_node_policy[nid]), which means memory is allocated preferentially from the specified NUMA node. It then installs MPOL_INTERLEAVE over the suitably sized nodes for system init.

    void __init numa_policy_init(void)
    {
        nodemask_t interleave_nodes;
        unsigned long largest = 0;
        int nid, prefer = 0;

        /* Set the memory policy of every NUMA node to MPOL_PREFERRED */
        for_each_node(nid) {
            preferred_node_policy[nid] = (struct mempolicy) {
                .refcnt = ATOMIC_INIT(1),
                .mode = MPOL_PREFERRED,
                .flags = MPOL_F_MOF | MPOL_F_MORON,
                .v = { .preferred_node = nid, },
            };
        }

        /*
         * Set interleaving policy for system init. Interleaving is only
         * enabled across suitably sized nodes (default is >= 16MB), or
         * fall back to the largest node if they're all smaller.
         */
        /* The two cases below decide which nodes go into interleave_nodes */
        nodes_clear(interleave_nodes);
        for_each_node_state(nid, N_MEMORY) {
            unsigned long total_pages = node_present_pages(nid);

            /* Preserve the largest node */
            if (largest < total_pages) {
                largest = total_pages;
                prefer = nid;
            }

            /* Interleave this node? */
            /* Case 1: the node holds at least 16 MB */
            if ((total_pages << PAGE_SHIFT) >= (16 << 20))
                node_set(nid, interleave_nodes);
        }

        /* All too small, use the largest */
        /*
         * Case 2: every node is smaller than 16 MB, so only the largest
         * node is put in interleave_nodes and the system uses its memory.
         */
        if (unlikely(nodes_empty(interleave_nodes)))
            node_set(prefer, interleave_nodes);

        /* Install MPOL_INTERLEAVE over the interleave_nodes built above */
        if (do_set_mempolicy(MPOL_INTERLEAVE, 0, &interleave_nodes))
            pr_err("%s: interleaving failed\n", __func__);

        /* Check whether NUMA balancing should be enabled */
        check_numabalancing_enable();
    }

At run time, get_task_policy() returns a task's memory allocation policy:

    struct mempolicy *get_task_policy(struct task_struct *p)
    {
        struct mempolicy *pol = p->mempolicy;
        int node;

        /* If the task has its own mempolicy, return it directly */
        if (pol)
            return pol;

        /*
         * numa_node_id() calls raw_cpu_read(numa_node) to get the NUMA
         * node id of the current CPU.
         */
        node = numa_node_id();
        /*
         * The task has no mempolicy: return the entry for the current
         * node from preferred_node_policy, which numa_policy_init()
         * initialised.
         */
        if (node != NUMA_NO_NODE) {
            pol = &preferred_node_policy[node];
            /* preferred_node_policy is not initialised early in boot */
            if (pol->mode)
                return pol;
        }

        return &default_policy;
    }
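The node selection in numa_policy_init() and the fallback order in get_task_policy() can be modeled in plain userspace C. The following is a minimal sketch, not kernel code: every model_* name is hypothetical, node sizes are given in bytes rather than pages, and a 64-bit bitmask stands in for nodemask_t, so it only handles up to 64 nodes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for kernel constants; names are assumptions. */
#define MODEL_NO_NODE (-1)                        /* plays the role of NUMA_NO_NODE */
#define MODEL_MIN_INTERLEAVE_BYTES (16ULL << 20)  /* the 16 MB threshold */

/* Mode 0 doubles as "not initialised", mirroring the pol->mode check. */
enum model_mode { MODEL_DEFAULT = 0, MODEL_PREFERRED, MODEL_INTERLEAVE };

struct model_policy {
    enum model_mode mode;
    int preferred_node;
};

/*
 * Build the interleave mask: every node with at least 16 MB is included.
 * If no node qualifies, fall back to the single largest node.
 * Bit n of the result is set when node n is selected.
 */
uint64_t model_pick_interleave_nodes(const uint64_t *node_bytes, int nr_nodes)
{
    uint64_t mask = 0, largest = 0;
    int prefer = 0;

    assert(nr_nodes > 0 && nr_nodes <= 64);
    for (int nid = 0; nid < nr_nodes; nid++) {
        if (node_bytes[nid] == 0)
            continue;                       /* skip memoryless nodes */
        if (largest < node_bytes[nid]) {    /* remember the largest node */
            largest = node_bytes[nid];
            prefer = nid;
        }
        if (node_bytes[nid] >= MODEL_MIN_INTERLEAVE_BYTES)
            mask |= 1ULL << nid;
    }
    if (mask == 0)                          /* all too small: use the largest */
        mask |= 1ULL << prefer;
    return mask;
}

/*
 * Fallback order: the task's own policy, then the per-node preferred
 * policy for the current node (if initialised), then the default policy.
 */
const struct model_policy *
model_get_task_policy(const struct model_policy *task_pol, int node,
                      const struct model_policy *per_node, int nr_nodes,
                      const struct model_policy *fallback)
{
    if (task_pol)
        return task_pol;
    if (node != MODEL_NO_NODE && node >= 0 && node < nr_nodes) {
        const struct model_policy *pol = &per_node[node];
        if (pol->mode != MODEL_DEFAULT)     /* zero mode: not set up yet */
            return pol;
    }
    return fallback;
}
```

With node sizes of 8 MB, 32 MB and 16 MB, the mask selects nodes 1 and 2. With only 4 MB and 8 MB nodes, it selects node 1 alone, the largest one.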
Reposted from: http://mjnmi.baihongyu.com/