Applying natural language processing to financial news requires robust methods that process all sentences correctly, including negated ones. To date, related research has commonly relied on rule-based algorithms to detect negated sentence fragments, known as negation scopes. However, these methods face limitations when confronted with complex language or the particularities of a given prose style. As an alternative, reinforcement learning offers the opportunity to learn suitable negation classifications through trial-and-error experience. This approach mimics human-like learning and thus appears well suited to natural language processing. Its episode-based, flexible structure allows it to handle even highly complex sentences. Our results provide evidence that reinforcement learning can outperform rule-based approaches from the related literature. The best-performing implementation achieves a predictive accuracy of up to 76.37% on a manually labeled dataset, exceeding the predictive accuracy of rule-based approaches by 2.55%. When applying the trained reinforcement learning implementation to sentiment analysis, we find a potential subjectivity bias that limits its predictive performance in forecasting stock market returns.
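
The episode-based formulation mentioned above can be illustrated with a minimal sketch: each sentence is treated as one episode, each token as a state, and the binary in-scope/out-of-scope decision as the action. The toy data, the `(token, after_cue)` state encoding, and the ±1 reward are illustrative assumptions for exposition, not the paper's actual implementation.

```python
import random
from collections import defaultdict

# Hypothetical sketch: negation scope detection framed as an episodic RL task.
# The agent walks a sentence token by token and chooses IN_SCOPE (1) or
# OUT_OF_SCOPE (0), receiving +1 for matching the gold label and -1 otherwise.
# Tabular Q-learning over a simple state: (token, seen-a-negation-cue flag).

ACTIONS = (0, 1)  # 0 = out of scope, 1 = in scope

# Toy labeled sentences (token, gold_in_scope); "not" acts as the negation cue.
SENTENCES = [
    [("profits", 0), ("did", 0), ("not", 0), ("rise", 1), ("sharply", 1)],
    [("revenue", 0), ("was", 0), ("not", 0), ("weak", 1)],
    [("the", 0), ("firm", 0), ("grew", 0), ("steadily", 0)],
]

def state(token, after_cue):
    return (token, after_cue)

def train(episodes=2000, alpha=0.3, epsilon=0.2, seed=0):
    """Epsilon-greedy tabular Q-learning over the toy sentences."""
    rng = random.Random(seed)
    q = defaultdict(float)  # maps (state, action) -> estimated value
    for _ in range(episodes):
        sent = rng.choice(SENTENCES)
        after_cue = False
        for token, gold in sent:
            if token == "not":
                after_cue = True
            s = state(token, after_cue)
            if rng.random() < epsilon:
                a = rng.choice(ACTIONS)  # explore
            else:
                a = max(ACTIONS, key=lambda x: q[(s, x)])  # exploit
            reward = 1.0 if a == gold else -1.0
            # One-step update; the immediate reward fully defines the target here.
            q[(s, a)] += alpha * (reward - q[(s, a)])
    return q

def predict(q, tokens):
    """Greedily tag each token as in scope (1) or out of scope (0)."""
    after_cue, tags = False, []
    for token in tokens:
        if token == "not":
            after_cue = True
        s = state(token, after_cue)
        tags.append(max(ACTIONS, key=lambda x: q[(s, x)]))
    return tags

q = train()
print(predict(q, ["profits", "did", "not", "rise", "sharply"]))
```

The per-token action choice is what gives the approach its flexibility: unlike a fixed window or punctuation rule, the learned policy can open and close the scope anywhere the reward signal justifies it.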