Access to training data is enough to protect the four essential freedoms
Nobody's still seriously asking for a "purely idealistic approach", but the OSI keeps trotting out this strawman. We have settled on a compromise that allows both openly licensed and publicly accessible data, but rejects proprietary data available for a fee (e.g., NYT articles, Adobe stock photos) and inaccessible data (e.g., Facebook social graph), as these prevent you from studying or modifying the model while exposing you to legal, security, and other risks.
In any case, by outsourcing the question, the OSI admits it is not competent to answer it: https://samjohnston.org/2024/10/15/the-osi-lacks-competence-to-define-open-source-ai/