
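This article presents a concise, reusable recursive method for flattening a list of dictionaries with deeply nested relationships (for example a geographic hierarchy: country → state → city → street → house) into a flat list of objects. Each object keeps the key fields (person, city, address, facebooklink), and the nodes are emitted in a predictable depth-first order.
In real-world data processing we often meet JSON shaped like a tree, such as the geographic chain "united states" → "ohio" → "clevland" → "Street A" → "House 1". Every node is a dictionary holding basic information (person, city, address, facebooklink), and it also carries its list of child nodes under a key that repeats the node's own name (for example "ohio": [...]). The goal is to extract the actual nodes at every level, ignore the keys that exist only to organize the nesting, and produce a linear, iterable list of dictionaries.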
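✅ Core idea
Identify the fields whose value is a list of dictionaries (these are the keys that represent a child level, such as "united states" or "ohio") and keep them out of the node itself, while retaining every other field (such as "person" or "address"). For each such nested field, call the flattening function recursively and merge its result into the final list.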
Implementation
def flatten_objects(data):
    """
    Recursively flatten a nested list of dicts (a tree structure).
    Assumptions: every node is a dict; child lists are always stored under
    string keys with list[dict] values; the fields to keep
    (person/city/address/facebooklink) are all non-list values.
    """
    result = []
    # Accept either a single dict or a list[dict]
    if isinstance(data, dict):
        data = [data]
    for item in data:
        # Split the node into its own fields ("metadata") and its child nodes
        metadata = {}
        nested_children = []
        for key, value in item.items():
            # A list whose elements are all dicts is treated as a list of child nodes
            if isinstance(value, list) and all(isinstance(x, dict) for x in value):
                nested_children.extend(value)
            else:
                metadata[key] = value
        # Recurse into the children first, so that deeper nodes are emitted first
        if nested_children:
            result.extend(flatten_objects(nested_children))
        # Then emit the current node (metadata only)
        if metadata:
            result.append(metadata)
    return result
Usage example
# Sample data (simplified, same shape as in the question)
nested_data = [
    {
        "person": "abc",
        "city": "united states",
        "facebooklink": "link",
        "address": "united states",
        "united states": [
            {
                "person": "cdf",
                "city": "ohio",
                "facebooklink": "link",
                "address": "united states/ohio",
                "ohio": [
                    {
                        "person": "efg",
                        "city": "clevland",
                        "facebooklink": "link",
                        "address": "united states/ohio/clevland",
                        "clevland": [
                            {
                                "person": "jkl",
                                "city": "Street A",
                                "facebooklink": "link",
                                "address": "united states/ohio/clevland/Street A",
                                "Street A": [
                                    {
                                        "person": "jkl",
                                        "city": "House 1",
                                        "facebooklink": "link",
                                        "address": "united states/ohio/clevland/Street A/House 1"
                                    }
                                ]
                            }
                        ]
                    },
                    {
                        "person": "ghi",
                        "city": "columbus",
                        "facebooklink": "link",
                        "address": "united states/ohio/columbus"
                    }
                ]
            },
            {
                "person": "abc",
                "city": "washington",
                "facebooklink": "link",
                "address": "united states/washington"
            }
        ]
    }
]
# Flatten and print
flattened = flatten_objects(nested_data)
for i, obj in enumerate(flattened, 1):
    print(f"{i}. {obj['address']} → {obj['person']}, {obj['city']}")
✅ The expected output is depth-first, starting at the deepest leaf and working back up to the root:
1. united states/ohio/clevland/Street A/House 1 → jkl, House 1
2. united states/ohio/clevland/Street A → jkl, Street A
3. united states/ohio/clevland → efg, clevland
4. united states/ohio/columbus → ghi, columbus
5. united states/ohio → cdf, ohio
6. united states/washington → abc, washington
7. united states → abc, united states
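Whether a parent should appear before or after its descendants depends only on where the current node is appended relative to the recursive call. The sketch below makes that choice explicit. The function name flatten_tree and the parent_first flag are hypothetical names introduced for illustration, not part of the original solution, and the small tree is made-up sample data.

```python
def flatten_tree(data, parent_first=False):
    """Flatten a nested list of dicts; flatten_tree and parent_first are hypothetical names."""
    if isinstance(data, dict):
        data = [data]
    result = []
    for item in data:
        metadata, children = {}, []
        for key, value in item.items():
            # A list made up only of dicts is treated as the node's children
            if isinstance(value, list) and all(isinstance(x, dict) for x in value):
                children.extend(value)
            else:
                metadata[key] = value
        own = [metadata] if metadata else []
        descendants = flatten_tree(children, parent_first)
        # The only difference between the two orders is this concatenation
        result.extend(own + descendants if parent_first else descendants + own)
    return result


# Made-up sample tree, smaller than the one in the article
tree = {
    "person": "abc", "address": "us",
    "us": [
        {"person": "cdf", "address": "us/ohio",
         "ohio": [{"person": "efg", "address": "us/ohio/clevland"}]},
        {"person": "xyz", "address": "us/washington"},
    ],
}

leaf_first = [n["address"] for n in flatten_tree(tree)]
root_first = [n["address"] for n in flatten_tree(tree, parent_first=True)]
print(leaf_first)  # ['us/ohio/clevland', 'us/ohio', 'us/washington', 'us']
print(root_first)  # ['us', 'us/ohio', 'us/ohio/clevland', 'us/washington']
```

With parent_first=False the result follows the leaf-first order shown in the output above; parent_first=True gives a top-down listing instead.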
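⚠️ Notes and best practices
- Independent of key names: the function does not rely on specific keys (such as "united states"). It detects nesting with isinstance(value, list) and all(isinstance(x, dict) for x in value), so it works for hierarchy keys of any name. Note that all() is True for an empty list, so a field holding [] is also classified as a child list and is dropped from the node's own fields.
- Robustness: in production, consider adding type checks (for example if not isinstance(item, dict): continue), filtering out empty values, or logging the recursion depth.
- Performance: for very deep nesting (hundreds of levels or more, approaching Python's default recursion limit of 1000), switch to an iterative implementation with an explicit stack to avoid a RecursionError. For a typical geographic hierarchy like this one (5 levels or fewer), recursion is perfectly adequate.
- Compared with the flatten_json library: flatten_json is designed to flatten key paths, joining nested keys and list indices into a single key (with its default "_" separator, something like "united states_0_ohio_0_person"). It does not fit this scenario, because what we need is node extraction rather than key concatenation.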
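The performance and robustness notes above can be combined into an iterative version. What follows is a minimal sketch, not a drop-in replacement: the name flatten_objects_iterative is hypothetical, and it assumes the same detection rule as the recursive function (a list containing only dicts is a child list). It keeps the leaf-first order by pushing each node's own fields back onto the stack underneath its children.

```python
def flatten_objects_iterative(data):
    """Iterative, leaf-first flattening; hypothetical name, same rule as the recursive version."""
    if isinstance(data, dict):
        data = [data]
    result = []
    # Stack entries are (node, ready). ready=True marks a node's own fields
    # that are waiting for all of its descendants to be emitted first.
    stack = [(node, False) for node in reversed(data)]
    while stack:
        node, ready = stack.pop()
        if ready:
            result.append(node)
            continue
        if not isinstance(node, dict):
            continue  # skip malformed top-level entries instead of raising
        metadata, children = {}, []
        for key, value in node.items():
            if isinstance(value, list) and all(isinstance(x, dict) for x in value):
                children.extend(value)
            else:
                metadata[key] = value
        if metadata:
            stack.append((metadata, True))
        # Push children reversed so the first child is popped (and emitted) first
        stack.extend((child, False) for child in reversed(children))
    return result


# Build a chain far deeper than the default recursion limit, without recursion
deep = {"address": "n4999"}
for i in reversed(range(4999)):
    deep = {"address": f"n{i}", "child": [deep]}

flat = flatten_objects_iterative(deep)
print(len(flat), flat[0]["address"], flat[-1]["address"])  # 5000 n4999 n0
```

Because the stack lives on the heap rather than in call frames, the depth is limited only by available memory.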
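Once you are comfortable with this pattern, you can adapt it to flatten other tree-shaped data such as organization charts, category trees, and comment reply chains.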