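What is the most Pythonic way to group by multiple keys and sum or average the values in a list of dictionaries in Python?

Suppose I have a list of dictionaries like the following: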
input_data = [
    {"dept": "001", "sku": "foo", "transId": "uniqueId1", "qty": 100},
    {"dept": "001", "sku": "bar", "transId": "uniqueId2", "qty": 200},
    {"dept": "001", "sku": "foo", "transId": "uniqueId3", "qty": 300},
    {"dept": "002", "sku": "baz", "transId": "uniqueId4", "qty": 400},
    {"dept": "002", "sku": "baz", "transId": "uniqueId5", "qty": 500},
    {"dept": "002", "sku": "qux", "transId": "uniqueId6", "qty": 600},
    {"dept": "003", "sku": "foo", "transId": "uniqueId7", "qty": 700},
]
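Desired aggregated output (qty summed per dept and sku):
output = [
    {"dept": "001", "sku": "foo", "qty": 400},
    {"dept": "001", "sku": "bar", "qty": 200},
    {"dept": "002", "sku": "baz", "qty": 900},
    {"dept": "002", "sku": "qux", "qty": 600},
    {"dept": "003", "sku": "foo", "qty": 700},
]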
Or the average:
output = [
    {"dept": "001", "sku": "foo", "avg": 200},
    {"dept": "001", "sku": "bar", "avg": 200},
    {"dept": "002", "sku": "baz", "avg": 450},
    {"dept": "002", "sku": "qux", "avg": 600},
    {"dept": "003", "sku": "foo", "avg": 700},
]
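Solution:
To get the aggregated (summed) results:
from itertools import groupby
from operator import itemgetter
from pprint import pprint

# Key function: returns the (dept, sku) tuple for a record.
grouper = itemgetter("dept", "sku")
result = []
# groupby only merges adjacent items, so sort by the same key first.
for key, grp in groupby(sorted(input_data, key=grouper), grouper):
    temp_dict = dict(zip(["dept", "sku"], key))
    temp_dict["qty"] = sum(item["qty"] for item in grp)
    result.append(temp_dict)

pprint(result)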
Output:
[{'dept': '001', 'qty': 200, 'sku': 'bar'},
 {'dept': '001', 'qty': 400, 'sku': 'foo'},
 {'dept': '002', 'qty': 900, 'sku': 'baz'},
 {'dept': '002', 'qty': 600, 'sku': 'qux'},
 {'dept': '003', 'qty': 700, 'sku': 'foo'}]
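A note on why the sorted(...) call matters: itertools.groupby only merges consecutive items with equal keys, so unsorted input can yield the same key more than once. The sketch below illustrates this with a hypothetical three-row sample (the name `rows` is an assumption for illustration, not part of the question):

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical sample: the two "foo" rows are not adjacent.
rows = [
    {"dept": "001", "sku": "foo", "qty": 100},
    {"dept": "001", "sku": "bar", "qty": 200},
    {"dept": "001", "sku": "foo", "qty": 300},
]
grouper = itemgetter("dept", "sku")

# Without sorting, groupby starts a new group every time the key changes,
# so ("001", "foo") shows up twice.
unsorted_keys = [key for key, _ in groupby(rows, grouper)]

# Sorting with the same key function makes equal keys adjacent first.
sorted_keys = [key for key, _ in groupby(sorted(rows, key=grouper), grouper)]

print(unsorted_keys)
print(sorted_keys)
```

The side effect of sorting is that the result comes back ordered by (dept, sku) rather than in the order the keys first appear in the input, which is why bar is listed before foo above.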
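To get the average, simply change the body of the for loop, like this:
result = []
for key, grp in groupby(sorted(input_data, key=grouper), grouper):
    temp_dict = dict(zip(["dept", "sku"], key))
    # Materialize the group so it can be both summed and counted.
    temp_list = [item["qty"] for item in grp]
    temp_dict["avg"] = sum(temp_list) / len(temp_list)
    result.append(temp_dict)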
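Output (on Python 3, where / is true division, the averages print as floats; on Python 2 these integer quantities would print as 200, 450, and so on):
[{'avg': 200.0, 'dept': '001', 'sku': 'bar'},
 {'avg': 200.0, 'dept': '001', 'sku': 'foo'},
 {'avg': 450.0, 'dept': '002', 'sku': 'baz'},
 {'avg': 600.0, 'dept': '002', 'sku': 'qux'},
 {'avg': 700.0, 'dept': '003', 'sku': 'foo'}]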
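The intermediate temp_list is there because each grp produced by groupby is a one-shot iterator: once sum has consumed it, nothing is left to count. A minimal sketch of that behavior, using a hypothetical two-row sample (`rows`, `total` and `leftover` are illustrative names):

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical sample containing a single (dept, sku) group.
rows = [
    {"dept": "002", "sku": "baz", "qty": 400},
    {"dept": "002", "sku": "baz", "qty": 500},
]
grouper = itemgetter("dept", "sku")

for key, grp in groupby(rows, grouper):
    total = sum(item["qty"] for item in grp)  # consumes the group iterator
    leftover = list(grp)                      # the iterator is now exhausted
    print(key, total, leftover)
```

Building a list first, as the loop above does, keeps the values around so that both sum and len can use them.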
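Suggestion: in any case, I would store both qty and avg in the same dict, like this:
result = []
for key, grp in groupby(sorted(input_data, key=grouper), grouper):
    temp_dict = dict(zip(["dept", "sku"], key))
    temp_list = [item["qty"] for item in grp]
    temp_dict["qty"] = sum(temp_list)
    # Reuse the sum just computed instead of summing twice.
    temp_dict["avg"] = temp_dict["qty"] / len(temp_list)
    result.append(temp_dict)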
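Output (averages shown as Python 3 floats):
[{'avg': 200.0, 'dept': '001', 'qty': 200, 'sku': 'bar'},
 {'avg': 200.0, 'dept': '001', 'qty': 400, 'sku': 'foo'},
 {'avg': 450.0, 'dept': '002', 'qty': 900, 'sku': 'baz'},
 {'avg': 600.0, 'dept': '002', 'qty': 600, 'sku': 'qux'},
 {'avg': 700.0, 'dept': '003', 'qty': 700, 'sku': 'foo'}]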
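If the first-seen order of the groups from the question matters (foo before bar in dept 001), one possible alternative is to accumulate running totals in a plain dict in a single pass, with no sorting. This is a different technique from the groupby approach above, offered only as a sketch: the helper name `aggregate` and its parameters are hypothetical, and it assumes Python 3.7+ so that dicts preserve insertion order.

```python
# The question's list, assumed to be bound to input_data.
input_data = [
    {"dept": "001", "sku": "foo", "transId": "uniqueId1", "qty": 100},
    {"dept": "001", "sku": "bar", "transId": "uniqueId2", "qty": 200},
    {"dept": "001", "sku": "foo", "transId": "uniqueId3", "qty": 300},
    {"dept": "002", "sku": "baz", "transId": "uniqueId4", "qty": 400},
    {"dept": "002", "sku": "baz", "transId": "uniqueId5", "qty": 500},
    {"dept": "002", "sku": "qux", "transId": "uniqueId6", "qty": 600},
    {"dept": "003", "sku": "foo", "transId": "uniqueId7", "qty": 700},
]


def aggregate(records, keys=("dept", "sku"), field="qty"):
    """Hypothetical helper: sum and average `field` per combination of `keys`."""
    totals = {}  # key tuple -> [running sum, count], in first-seen order
    for rec in records:
        key = tuple(rec[k] for k in keys)
        acc = totals.setdefault(key, [0, 0])
        acc[0] += rec[field]
        acc[1] += 1

    result = []
    for key, (total, count) in totals.items():
        row = dict(zip(keys, key))
        row[field] = total
        row["avg"] = total / count  # true division, so a float on Python 3
        result.append(row)
    return result


result = aggregate(input_data)
for row in result:
    print(row)
```

The trade-off is that this version holds one accumulator per group in memory but avoids the O(n log n) sort, and it returns the groups in the order they first occur in the input.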
Tags: python, list, dictionary