我有一组整数,比如S = {1,…,10},还有两个矩阵N和M,它们的行是一些(但不一定是所有可能的)来自S阶的元素的排列,比如3和例如,分别为5 N = [1 2 3; 2 5 3; …],M = [1 2 3 4 5; 2 4 7 8 1; …]. 置换P的子
置换P的子置换Q仅是P的索引子集,使得Q的元素的索引的阶数与它们在P中的索引的阶数相同.示例:[2,4,7]是[2,3,4,6,7,1]的子置换,但[1,2,3]不是后者的子置换.
我需要一种有效的方法(例如尽可能矢量化并尽可能小的for循环)来查找
(1)来自M的所有排列,其具有来自N的子排列
和
(2)在M中找到N的每个子排列多少次.
到目前为止,我所拥有的是一个矢量化代码,用于检查M中是否包含给定的单个子置换(以及多少次),但是我必须使用通过N的parfor-loop,这对于非常大的N来说会变慢.注意,如果N不是太大,人们也可以通过简单地从给定的3元组构造允许的5元组并将结果与M进行比较来解决问题,但是如果N可以很快变得比简单的强制要慢得多.足够大了.
查看问题的另一种方法如下:检查其行的N个模数排列是否是一般意义上的M的子矩阵,即,是否可以通过删除M中的元素来获得N行的排列. .
抱歉,如果我的问题太基础,我的背景来自算术代数几何和表示理论,我对MATLAB很新.
编辑:
这是我检查M中单个k元组的存在的代码:
[码]
function [A,f] = my_function(x,M) %// returns all rows in M that contain x and the absolute frequency of x in M %// suboptimal for checking combinations rather than permutations byy at least ~ 50% k = size(x,2); m = size(M,1); R = zeros(m,k); I = R; Z = I; for j = 1:k [R(:,j),I(:,j)] = max((M == x(j)),[],2); Z(:,j) = R(:,j).*I(:,j); end z = zeros(m,k-1); for j = 1:(k-1) z(:,j) = (Z(:,j) > 0 & Z(:,j) < Z(:,j+1)); end [v,~] = find(sum(z,2) == k-1); A = M(v,:); f = length(v); end
使用此函数,通过N检查只是一个简单的(par)for循环问题,我希望避免使用更快的矢量化解决方案.
方法#1[val,ind] = max(bsxfun(@eq,permute(M,[4 2 1 3]),permute(N,[2 3 4 1])),[],2) matches = squeeze(all(diff(ind,1)>0,1).*all(val,1)) out1 = any(matches,2) %// Solution - 1 out2 = sum(matches,1) %// Solution - 2
方法#2
避免置换N的另一种方法可能对于longish N更好 –
[val,ind] = max(bsxfun(@eq,N,permute(M,[3 4 1 2])),[],4) matches = squeeze(all(diff(ind,[],2)>0,2).*all(val,2)) out1 = any(matches,1) %// Solution - 1 out2 = sum(matches,2) %// Solution - 2
方法#3
用于大型数据的内存scroogey方法 –
out1 = false(size(M,1),1); %// Storage for Solution - 1 out2 = zeros(size(N,1),1); %// Storage for Solution - 2 for k=1:size(N,1) [val3,ind3] = max(bsxfun(@eq,N(k,:),permute(M,[1 3 2])),[],3); matches = all(diff(ind3,[],2)>0,2).*all(val3,2); out1 = or(out1,matches); out2(k) = sum(matches); end
方法#4
GPU的内存scroogey方法 –
gM = gpuArray(M); gN = gpuArray(N); gout1 = false(size(gM,1),1,'gpuArray'); %// GPU Storage for Solution - 1 gout2 = zeros(size(gN,1),1,'gpuArray'); %// GPU Storage for Solution - 2 for k=1:size(gN,1) [val3,ind3] = max(bsxfun(@eq,gN(k,:),permute(gM,[1 3 2])),[],3); matches = all(diff(ind3,[],2)>0,2).*all(val3,2); gout1 = or(gout1,matches); gout2(k) = sum(matches); end out1 = gather(gout1); %// Solution - 1 out2 = gather(gout2); %// Solution - 2
现在,这种GPU方法已经吹走了所有其他方法.它使用M:320000X5和N:2100X3(与输入大小相同)运行,填充随机整数.使用GTX 750 Ti,仅用了13.867873秒!因此,如果您拥有足够内存的GPU,这也可能是您的赢家方法.
方法#5
适用于GPU的极其内存scroogey方法 –
gM = gpuArray(M); gN = gpuArray(N); gout1 = false(size(gM,1),1,'gpuArray'); %// GPU Storage for Solution - 1 gout2 = zeros(size(gN,1),1,'gpuArray'); %// GPU Storage for Solution - 2 for k=1:size(gN,1) [val2,ind2] = max(bsxfun(@eq,gM,permute(gN(k,:),[1 3 2])),[],2); matches = all(diff(ind2,[],3)>0,3).*all(val2,3); gout1 = or(gout1,matches); gout2(k) = sum(matches); end out1 = gather(gout1); %// Solution - 1 out2 = gather(gout2); %// Solution - 2