特色栏目： python 批处理 net编程 Javascript Php Asp Css Html5 Android seo centos

性能 – 通过数组操作有效搜索包含子排列的排列？

来源：互联网收集：自由互联发布时间：2021-06-22

我有一组整数,比如S = {1,…,10},还有两个矩阵N和M,它们的行是一些(但不一定是所有可能的)来自S阶的元素的排列,比如3和例如,分别为5 N = [1 2 3; 2 5 3; …],M = [1 2 3 4 5; 2 4 7 8 1; …]. 置换P的子

我有一组整数,比如S = {1,…,10},还有两个矩阵N和M,它们的行是一些(但不一定是所有可能的)来自S阶的元素的排列,比如3和例如,分别为5 N = [1 2 3; 2 5 3; …],M = [1 2 3 4 5; 2 4 7 8 1; …].

置换P的子置换Q仅是P的索引子集,使得Q的元素的索引的阶数与它们在P中的索引的阶数相同.示例：[2,4,7]是[2,3,4,6,7,1]的子置换,但[1,2,3]不是后者的子置换.

我需要一种有效的方法(例如尽可能矢量化并尽可能小的for循环)来查找

(1)来自M的所有排列,其具有来自N的子排列

和

(2)在M中找到N的每个子排列多少次.

到目前为止,我所拥有的是一个矢量化代码,用于检查M中是否包含给定的单个子置换(以及多少次),但是我必须使用通过N的parfor-loop,这对于非常大的N来说会变慢.注意,如果N不是太大,人们也可以通过简单地从给定的3元组构造允许的5元组并将结果与M进行比较来解决问题,但是如果N可以很快变得比简单的强制要慢得多.足够大了.

查看问题的另一种方法如下：检查其行的N个模数排列是否是一般意义上的M的子矩阵,即,是否可以通过删除M中的元素来获得N行的排列. .

抱歉,如果我的问题太基础,我的背景来自算术代数几何和表示理论,我对MATLAB很新.

编辑：
这是我检查M中单个k元组的存在的代码：
[码]

function [A,f] = my_function(x,M)
%// returns all rows in M that contain x and the absolute frequency of x in M
%// suboptimal for checking combinations rather than permutations byy at least ~ 50%
k = size(x,2);
m = size(M,1);
R = zeros(m,k);
I = R;
Z = I;
    for j = 1:k   
        [R(:,j),I(:,j)] = max((M == x(j)),[],2); 
        Z(:,j) = R(:,j).*I(:,j);
    end
z = zeros(m,k-1);
    for j = 1:(k-1)
        z(:,j) = (Z(:,j) > 0 & Z(:,j) < Z(:,j+1)); 
    end
[v,~] = find(sum(z,2) == k-1);    
A = M(v,:);
f = length(v);
end

使用此函数,通过N检查只是一个简单的(par)for循环问题,我希望避免使用更快的矢量化解决方案.

方法#1

[val,ind] = max(bsxfun(@eq,permute(M,[4 2 1 3]),permute(N,[2 3 4 1])),[],2)
matches = squeeze(all(diff(ind,1)>0,1).*all(val,1))
out1 = any(matches,2) %// Solution - 1
out2 = sum(matches,1) %// Solution - 2

方法#2

避免置换N的另一种方法可能对于longish N更好 –

[val,ind] = max(bsxfun(@eq,N,permute(M,[3 4 1 2])),[],4)
matches = squeeze(all(diff(ind,[],2)>0,2).*all(val,2))
out1 = any(matches,1) %// Solution - 1
out2 = sum(matches,2) %// Solution - 2

方法#3

用于大型数据的内存scroogey方法 –

out1 = false(size(M,1),1);  %// Storage for Solution - 1
out2 = zeros(size(N,1),1);  %// Storage for Solution - 2
for k=1:size(N,1)
    [val3,ind3] = max(bsxfun(@eq,N(k,:),permute(M,[1 3 2])),[],3);
    matches = all(diff(ind3,[],2)>0,2).*all(val3,2);
    out1 = or(out1,matches);
    out2(k) = sum(matches);
end

方法#4

GPU的内存scroogey方法 –

gM = gpuArray(M);
gN = gpuArray(N);

gout1 = false(size(gM,1),1,'gpuArray');  %// GPU Storage for Solution - 1
gout2 = zeros(size(gN,1),1,'gpuArray');  %// GPU Storage for Solution - 2
for k=1:size(gN,1)
    [val3,ind3] = max(bsxfun(@eq,gN(k,:),permute(gM,[1 3 2])),[],3);
    matches = all(diff(ind3,[],2)>0,2).*all(val3,2);
    gout1 = or(gout1,matches);
    gout2(k) = sum(matches);
end
out1 = gather(gout1);  %// Solution - 1
out2 = gather(gout2);  %// Solution - 2

现在,这种GPU方法已经吹走了所有其他方法.它使用M：320000X5和N：2100X3(与输入大小相同)运行,填充随机整数.使用GTX 750 Ti,仅用了13.867873秒！因此,如果您拥有足够内存的GPU,这也可能是您的赢家方法.

方法#5

适用于GPU的极其内存scroogey方法 –

gM = gpuArray(M);
gN = gpuArray(N);

gout1 = false(size(gM,1),1,'gpuArray');  %// GPU Storage for Solution - 1
gout2 = zeros(size(gN,1),1,'gpuArray');  %// GPU Storage for Solution - 2
for k=1:size(gN,1)
    [val2,ind2] = max(bsxfun(@eq,gM,permute(gN(k,:),[1 3 2])),[],2);
    matches = all(diff(ind2,[],3)>0,3).*all(val2,3);
    gout1 = or(gout1,matches);
    gout2(k) = sum(matches);
end
out1 = gather(gout1);  %// Solution - 1
out2 = gather(gout2);  %// Solution - 2

上一篇：垃圾收集会影响性能吗？
下一篇：【Code Tools】AB性能测试工具（二）

性能 – 通过数组操作有效搜索包含子排列的排列？

相关文章