I need to build a dictionary from a large list of dicts, removing all duplicate dicts.
The input list looks something like this:

    input = [{'id': 1, 'value1': 'value1', 'value2': 'value2'},
             {'id': 2, 'value1': 'value1', 'value2': 'value2'},
             {'id': 2, 'value1': 'value1', 'value3': 'value4'}]

And I want to make a dictionary like this, using the 'id' value as the key of the new dictionary:

    output = {1: [{'id': 1, 'value1': 'value1', 'value2': 'value2'}],
              2: [{'id': 2, 'value1': 'value1', 'value2': 'value2'},
                  {'id': 2, 'value1': 'value1', 'value3': 'value4'}]}

My first try was:

    output = {}
    for el in input:
        if el['id'] not in output or el not in output[el['id']]:
            output.setdefault(el['id'], []).append(el)

And it actually works, but it is very slow; len(input) is about 20k to 30k items.
Is there any way to make it a bit quicker? Thank you!
Use a separate set to track the dictionaries you have already seen; you have to convert them to a hashable representation first:

    seen = set()
    drepr = lambda d: tuple(sorted(d.items()))
    output = {}
    for el in input:
        if drepr(el) not in seen:
            output.setdefault(el['id'], []).append(el)
            seen.add(drepr(el))

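To see why the conversion is needed, here is a small sketch. The two sample dicts are made up for illustration, and drepr is written as a plain function rather than a lambda. It shows that a dict itself cannot be hashed, while the sorted tuple of its items can, and that insertion order does not affect the result. This assumes all values are hashable and all keys are mutually comparable (for example, all strings); nested lists or dicts as values would need a different representation.

```python
# Hypothetical sample data: same items, different insertion order.
a = {'id': 2, 'value1': 'value1', 'value2': 'value2'}
b = {'value2': 'value2', 'id': 2, 'value1': 'value1'}


def drepr(d):
    # Sort the (key, value) pairs so insertion order does not matter,
    # then freeze them in a tuple, which is hashable.
    return tuple(sorted(d.items()))


# A dict is mutable, so hashing it (as a set would have to) fails.
try:
    hash(a)
    dict_error = None
except TypeError as exc:
    dict_error = str(exc)

key_a = drepr(a)
key_b = drepr(b)
seen = {key_a}

print(dict_error)       # the TypeError message for the raw dict
print(key_a == key_b)   # equal dicts give equal keys
print(key_b in seen)    # so the second dict is recognised as a duplicate
```

The set lookup takes roughly constant time on average, whereas `el not in output[el['id']]` compares against every dict already stored in that list, which is where the original loop loses time.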
You can speed this up a little by using a defaultdict, since that avoids a method lookup and the stack frame needed to call it:

    from collections import defaultdict

    seen = set()
    drepr = lambda d: tuple(sorted(d.items()))
    output = defaultdict(list)
    for el in input:
        if drepr(el) not in seen:
            output[el['id']].append(el)
            seen.add(drepr(el))

Demo:

    >>> input = [{'id': 1, 'value1': 'value1', 'value2': 'value2'}, {'id': 2, 'value1': 'value1', 'value2': 'value2'}, {'id': 2, 'value1': 'value1', 'value3': 'value4'}]
    >>> seen = set()
    >>> drepr = lambda d: tuple(sorted(d.items()))
    >>> output = {}
    >>> for el in input:
    ...     if drepr(el) not in seen:
    ...         output.setdefault(el['id'], []).append(el)
    ...         seen.add(drepr(el))
    ...
    >>> from pprint import pprint
    >>> pprint(output)
    {1: [{'id': 1, 'value1': 'value1', 'value2': 'value2'}],
     2: [{'id': 2, 'value1': 'value1', 'value2': 'value2'},
         {'id': 2, 'value1': 'value1', 'value3': 'value4'}]}

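If you want this as a reusable piece, one possible variation is sketched below. The function name group_unique_by_id and the sample records (including the deliberate duplicate) are my own, not from the question. It follows the defaultdict approach but builds the hashable key only once per record instead of twice, and converts the result back to a plain dict at the end.

```python
from collections import defaultdict


def group_unique_by_id(records):
    """Group dicts by their 'id' value, dropping exact duplicates.

    Hypothetical helper; assumes every record has an 'id' key and
    that all values are hashable.
    """
    seen = set()
    output = defaultdict(list)
    for el in records:
        key = tuple(sorted(el.items()))  # built once per record
        if key not in seen:
            seen.add(key)
            output[el['id']].append(el)
    return dict(output)  # plain dict, so missing ids raise KeyError again


# Made-up sample data; the last record duplicates the first.
records = [
    {'id': 1, 'value1': 'value1', 'value2': 'value2'},
    {'id': 2, 'value1': 'value1', 'value2': 'value2'},
    {'id': 2, 'value1': 'value1', 'value3': 'value4'},
    {'id': 1, 'value1': 'value1', 'value2': 'value2'},
]

result = group_unique_by_id(records)
print(result)
```

Whether computing the key once makes a measurable difference depends on your data; it is worth timing on the real 20k to 30k item list before settling on either form.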